# CS8091 - BIG DATA ANALYTICS Syllabus 2017 Regulation

CS8091 BIG DATA ANALYTICS L T P C
OBJECTIVES:

• To know the fundamental concepts of big data and analytics.
• To explore tools and practices for working with big data.
• To learn about stream computing.
• To know about the research that requires the integration of large amounts of data.

## UNIT I INTRODUCTION TO BIG DATA 9

Evolution of Big data – Best Practices for Big data Analytics – Big data characteristics – Validating – The Promotion of the Value of Big Data – Big Data Use Cases – Characteristics of Big Data Applications – Perception and Quantification of Value – Understanding Big Data Storage – A General Overview of High-Performance Architecture – HDFS – MapReduce and YARN – MapReduce Programming Model.

## UNIT II CLUSTERING AND CLASSIFICATION 9

Advanced Analytical Theory and Methods: Overview of Clustering – K-means – Use Cases – Overview of the Method – Determining the Number of Clusters – Diagnostics – Reasons to Choose and Cautions – Classification: Decision Trees – Overview of a Decision Tree – The General Algorithm – Decision Tree Algorithms – Evaluating a Decision Tree – Decision Trees in R – Naïve Bayes – Bayes' Theorem – Naïve Bayes Classifier.

## UNIT III ASSOCIATION AND RECOMMENDATION SYSTEM 9

Advanced Analytical Theory and Methods: Association Rules – Overview – Apriori Algorithm – Evaluation of Candidate Rules – Applications of Association Rules – Finding Association & Finding Similarity – Recommendation System: Collaborative Recommendation – Content Based Recommendation – Knowledge Based Recommendation – Hybrid Recommendation Approaches.

## UNIT IV STREAM MEMORY 9

Introduction to Streams Concepts – Stream Data Model and Architecture – Stream Computing – Sampling Data in a Stream – Filtering Streams – Counting Distinct Elements in a Stream – Estimating Moments – Counting Ones in a Window – Decaying Window – Real Time Analytics Platform (RTAP) Applications – Case Studies – Real Time Sentiment Analysis, Stock Market Predictions. Using Graph Analytics for Big Data: Graph Analytics.

## UNIT V NOSQL DATA MANAGEMENT FOR BIG DATA AND VISUALIZATION 9

NoSQL Databases: Schema-less Models: Increasing Flexibility for Data Manipulation – Key Value Stores – Document Stores – Tabular Stores – Object Data Stores – Graph Databases – Hive – Sharding – HBase – Analyzing big data with Twitter – Big data for E-Commerce – Big data for blogs – Review of Basic Data Analytic Methods using R.

OUTCOMES:

Upon completion of the course, the students will be able to:

• Work with big data tools and their analysis techniques.
• Analyze data by utilizing clustering and classification algorithms.
• Learn and apply different mining algorithms and recommendation systems for large volumes of data.
• Perform analytics on data streams.
• Learn NoSQL databases and management.

TEXT BOOKS:

1. Anand Rajaraman and Jeffrey David Ullman, “Mining of Massive Datasets”, Cambridge University Press, 2012.
2. David Loshin, “Big Data Analytics: From Strategic Planning to Enterprise Integration with Tools, Techniques, NoSQL, and Graph”, Morgan Kaufmann/Elsevier Publishers, 2013.

REFERENCES:

1. EMC Education Services, “Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data”, Wiley Publishers, 2015.
2. Bart Baesens, “Analytics in a Big Data World: The Essential Guide to Data Science and its Applications”, Wiley Publishers, 2015.
3. Dietmar Jannach and Markus Zanker, “Recommender Systems: An Introduction”, Cambridge University Press, 2010.
4. Kim H. Pries and Robert Dunnigan, “Big Data Analytics: A Practical Guide for Managers”, CRC Press, 2015.
5. Jimmy Lin and Chris Dyer, “Data-Intensive Text Processing with MapReduce”, Synthesis Lectures on Human Language Technologies, Vol. 3, No. 1, Pages 1-177, Morgan & Claypool Publishers, 2010.