
Hillhacks

Spark - How to do cluster computing at scale

A Talk by Hitesh

Abstract

An introduction to Spark, a framework for ingesting and processing large amounts of data through a few very simple primitives.

Main Description

Spark is a fast and general engine for large-scale data processing. We will look at how Spark achieves its degree of parallelism while staying simple to use. In particular, we will go through the data processing primitives in Spark and work through toy examples that demonstrate their use. By the end of the talk, the audience will have a basic understanding of Spark and know how to process large quantities of data in a simple manner.
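To give a flavour of the kind of toy example the talk refers to, here is a minimal word-count sketch in plain Python. It does not use Spark itself: it only imitates the general idea of splitting data into partitions, transforming each partition independently, and merging the partial results with an associative function, which is roughly the role played by RDD operations such as flatMap, map and reduceByKey. All function names below (split_into_partitions, count_words_in_partition, merge_counts, word_count) are hypothetical and chosen for illustration, and the thread pool stands in for a cluster of workers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_partitions(records, num_partitions):
    """Deal the records round-robin into num_partitions lists."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def count_words_in_partition(lines):
    """Split lines into words and count them within a single partition."""
    counts = defaultdict(int)
    for line in lines:
        for word in line.lower().split():
            counts[word] += 1
    return dict(counts)


def merge_counts(left, right):
    """Combine two partial counts; addition is associative, so order is free."""
    merged = dict(left)
    for word, n in right.items():
        merged[word] = merged.get(word, 0) + n
    return merged


def word_count(lines, num_partitions=2):
    """Count words by processing partitions independently, then merging."""
    partitions = split_into_partitions(lines, num_partitions)
    # Each partition is handled by its own worker (threads here, not a cluster).
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(count_words_in_partition, partitions))
    return reduce(merge_counts, partials, {})


lines = ["spark is fast", "spark is simple", "data is large"]
result = word_count(lines)
```

Because each partition is processed without looking at the others, the per-partition step can run on as many workers as there are partitions, and the number of partitions does not change the final counts.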

Speaker

I am a security researcher working in the areas of machine learning, big data, and network security. In my spare time I like to fiddle with code to make networks and people a little safer from malicious activity.

Github - https://github.com/hiteshd

LinkedIn - https://www.linkedin.com/profile/view?id=167359555

Homepage - http://mason.gmu.edu/~hdharmda/

Blogs - https://www.fireeye.com/blog/threat-research.html/category/etc/tags/fireeye-blog-authors/cap-hitesh-dharmdasani
