Who this course is for
This course is aimed mainly at students with a science or engineering background who hope to work in data science. By completing a Spark project typical of industry practice, students build the skills and ways of thinking needed for data science projects in the big data era, strengthening their competitiveness in applications and job hunting.
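What you will gain
A completed big data machine learning project
A command of Spark's basic operations and core logic
Competitiveness in the North American job market
A recommendation letter from an Oracle mentor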
Skills you will learn
Spark data processing
Spark text processing and machine learning
Data analysis
Topic introduction
In recent years, an unprecedented big data revolution has taken place around the world. At an Internet company with more than 100 million daily active users, the familiar pandas and scikit-learn can no longer meet day-to-day needs, and Spark has become one of the most widely used big data processing frameworks. Industry demand for Spark skills keeps growing, yet Spark is rarely taught in school courses. This course starts with basic concepts and leads students step by step through their first complete Spark project. Along the way, we share the techniques and ways of thinking most commonly used in industry, so that you can better understand the core logic of processing big data.
Course outline
The bootcamp runs for four weeks. In the first week, the instructor meets with students online, introduces basic concepts, and teaches Spark data processing. In the second week, students learn Spark text processing and machine learning operations and build a simple sentiment analysis model. In the last two weeks, students apply what they have learned to a practical Spark project: a fake review detection system built on Yelp data that uses both user data and review text.
1
Introduction to Big Data & Spark
Assessment: basic Spark concepts
Instruction: 1 hour; Study: 1 hour
2
Data manipulation
Assessment: using Spark to process structured data
Instruction: 2 hours; Study: 4 hours
3
Spark Machine Learning and NLP
Assessment: using Spark to process text data and build a sentiment analysis model
Instruction: 3 hours; Study: 6 hours
4
Project – Build a Spark-based end-to-end anti-fraud system using the Yelp dataset
Assessment: hands-on Spark experience
Instruction: 5 hours; Study: 10 hours
5
Spark interview FAQ, advanced Spark topics, summary
Assessment: Q&A
Instruction: 1 hour; Study: 1 hour
Course instructor