[arXiv] BigDataFr recommends: Scalable and Accurate Online Feature Selection for Big Data

Feature selection is important in many big data applications. There are at least two critical challenges. First, in many applications the dimensionality is extremely high, in the millions, and keeps growing. Second, feature selection has to be highly scalable, preferably in an online manner such […]

[Datasciencecentral] BigDataFr recommends: Interview with Gideon Mann, Head of Data Science at Bloomberg #datascientist

Interview with Gideon Mann, Head of Data Science at Bloomberg, where he guides the strategic direction for machine learning, natural language processing, and search on the core terminal. He joined Bloomberg from Google Research. At Google, in addition to academic research, his team […]

[arXiv] BigDataFr recommends: Preconditioned Data Sparsification for Big Data with Applications to PCA and K-means

We analyze a compression scheme for large data sets that randomly keeps a small percentage of the components of each data sample. The benefit is that the output is a sparse matrix and therefore subsequent processing, such as PCA or […]

[arXiv] BigDataFr recommends: Making problems tractable on big data via preprocessing with polylog-size output

To provide a dichotomy between those queries that can be made feasible on big data after appropriate preprocessing and those for which preprocessing does not help, Fan et al. developed the ⊓-tractability theory. This theory provides a formal foundation for understanding the […]

[arXiv] BigDataFr recommends: Big Data Analytics-Enhanced Cloud Computing: Challenges, Architectural Elements, and Future Directions

The emergence of cloud computing has made dynamic provisioning of elastic capacity to applications on-demand. Cloud data centers contain thousands of physical servers hosting orders of magnitude more virtual machines that can be allocated on demand to users in a pay-as-you-go […]

[arXiv] BigDataFr recommends: An Extended Classification and Comparison of NoSQL Big Data Models

In the last few years, the volume of data has grown manyfold. Data stores have been inundated by various disparate data sources, led by social media such as Facebook and Twitter. The existing data models are largely unable to illuminate the […]

[arXiv] BigDataFr recommends: Learning to Hash for Indexing Big Data – A Survey

The explosive growth in big data has attracted much attention in designing efficient indexing and search methods recently. In many critical applications such as large-scale search and pattern matching, finding the nearest neighbors to a query is a fundamental research problem. However, the […]

[arXiv] BigDataFr recommends: Train faster, generalize better – Stability of stochastic gradient descent #datascientist

We show that any model trained by a stochastic gradient method with few iterations has vanishing generalization error. We prove this by showing the method is algorithmically stable in the sense of Bousquet and Elisseeff. Our analysis only employs elementary tools from convex […]

[arXiv] BigDataFr recommends: Empirical Big Data Research: A Systematic Literature Mapping #machinelearning

Background: Big Data is a relatively new field of research and technology, and the literature reports a wide variety of concepts labeled with Big Data. The maturity of a research field can be measured by the number of publications containing empirical results. In this paper we […]