Algorithms for Big Data

This is a one-day training course for those who want detailed insight into algorithms for big data.

Real-world applications are moving from being compute-bound to being data-bound, and the datasets involved are large and heterogeneous. This course introduces algorithm design for such datasets, with a specific focus on integrating information from multiple datasets. The course is based on one or more recent publications.


Duration: 1 Day


Audience: This course is for CTOs, IT managers, and Hadoop developers.


Prerequisites: Familiarity with big data technologies such as Hadoop and MapReduce, and a fair amount of mathematical maturity.


Price: We offer discounts; please email us at contactus@taleresearch.com with the subject line "Training".



Skills Taught

Sampling

  • Reservoir Sampling
  • Sampling from Databases
  • Monte Carlo Estimation
  • Coupling and Mixing Times

Streaming Algorithms
  • Filtering & Estimating the Number of Distinct Elements
  • Frequency Moments
  • Online Aggregation
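
To illustrate distinct-element estimation, here is a minimal sketch of one well-known approach, the k-minimum-values (KMV) estimator. It is one of several techniques for this problem and is not necessarily the one taught in the course; the names estimate_distinct and _unit_hash, and the default k = 256, are assumptions made for this example.

```python
import hashlib
import heapq
from typing import Hashable, Iterable, List, Set


def _unit_hash(item: Hashable) -> float:
    """Map an item to a pseudo-uniform value in [0, 1) via SHA-1."""
    digest = hashlib.sha1(repr(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def estimate_distinct(stream: Iterable[Hashable], k: int = 256) -> float:
    """Estimate the number of distinct items using O(k) memory.

    Keeps the k smallest distinct hash values seen so far. If the
    k-th smallest value is v, the estimate is (k - 1) / v.
    """
    heap: List[float] = []      # negated hashes: a max-heap of the k smallest
    members: Set[float] = set() # hash values currently held in the heap
    for item in stream:
        h = _unit_hash(item)
        if h in members:
            continue  # repeat of an item already tracked
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        # Fewer than k distinct hashes were seen, so the count is exact.
        return float(len(heap))
    return (k - 1) / (-heap[0])


if __name__ == "__main__":
    data = [i % 10_000 for i in range(50_000)]
    print(round(estimate_distinct(data)))
```

A larger k uses more memory but tightens the estimate; the relative error shrinks roughly in proportion to 1/sqrt(k).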

Parallel Architectures
  • MapReduce Basics
  • Evaluating Joins on MapReduce
  • Graphs on MapReduce
  • Feed Following

Clustering and Deduplication
  • Clustering
  • Deduplication
  • Correlation Clustering
  • Deduplication in Linked Data
  • Active Learning

Class Structure

This class is theory-based and is presented with slides and sample projects.


Medium

Classroom Only


Words of Encouragement

Tale Research can help you lay the groundwork for becoming a world-class big data resource to your customers, colleagues, and company by providing the skills and accreditation you need to succeed.