Wednesday, May 10, 2017

Apache Spark Design Patterns Using Scala, Series 1: The word count

A simple word count using Scala in Spark
Simple word count example - Click to see code
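The linked code is not reproduced here. As a rough sketch of what a naive Spark word count of this kind might look like (this is not the author's code; the object name `SimpleWordCount`, the `local[*]` master and the `Posts.xml` path are assumptions for illustration):

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object SimpleWordCount {
  // Split every line on whitespace and count how often each token occurs.
  // Nothing is parsed or cleaned: whatever appears on the line is a "word".
  def countWords(lines: RDD[String]): RDD[(String, Int)] =
    lines
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .map(word => (word, 1))
      .reduceByKey(_ + _)

  def main(args: Array[String]): Unit = {
    // Assumption: a local run against a Posts.xml file in the working directory.
    val spark = SparkSession.builder()
      .appName("SimpleWordCount")
      .master("local[*]")
      .getOrCreate()

    val counts = countWords(spark.sparkContext.textFile("Posts.xml"))
    counts.take(20).foreach { case (word, n) => println(s"$word\t$n") }

    spark.stop()
  }
}
```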
The above code has several limitations. The objective is to count the words in each post, but Posts.xml also carries a lot of metadata such as OwnerUserId, Title and Tags. The information we need is in the Body.
The missing logic is:
1) Count words in the Body only
2) Error handling
3) Data clean up - we don't count single quotes or special characters
The enhanced example uses case classes and Scala's XML support (scala.xml, which shipped with the standard library through Scala 2.10 and has been the separate scala-xml module since Scala 2.11).
Enhanced word count example - Click to see code
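The three missing pieces listed above can be sketched as follows. This is a minimal illustration, not the author's linked code. It assumes each post sits on its own line as a `<row ... />` element with a `Body` attribute containing HTML. The names `Post`, `EnhancedWordCount`, `parsePost` and `cleanWords` are hypothetical, and removing apostrophes (so that "don't" becomes "dont") is just one possible clean-up choice.

```scala
import scala.util.Try
import scala.xml.XML
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// One row of Posts.xml, reduced to the only field we count words in.
case class Post(body: String)

object EnhancedWordCount {
  // Error handling: lines that are not well-formed XML on their own
  // (the XML declaration, the opening/closing root tag, corrupt rows)
  // and rows without a Body attribute yield None instead of failing the job.
  def parsePost(line: String): Option[Post] =
    Try(XML.loadString(line.trim)).toOption
      .map(elem => (elem \ "@Body").text)
      .filter(_.nonEmpty)
      .map(body => Post(body))

  // Data clean-up: drop HTML tags, lower-case, remove single quotes,
  // then split on anything that is not a letter or digit.
  def cleanWords(body: String): Seq[String] =
    body
      .replaceAll("<[^>]*>", " ")
      .toLowerCase
      .replaceAll("'", "")
      .split("[^a-z0-9]+")
      .filter(_.nonEmpty)
      .toSeq

  // Count words in the Body only; metadata attributes are never tokenized.
  def countWords(lines: RDD[String]): RDD[(String, Int)] =
    lines
      .flatMap(line => parsePost(line).toList)
      .flatMap(post => cleanWords(post.body))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

  def main(args: Array[String]): Unit = {
    // Assumption: a local run against a Posts.xml file in the working directory.
    val spark = SparkSession.builder()
      .appName("EnhancedWordCount")
      .master("local[*]")
      .getOrCreate()

    countWords(spark.sparkContext.textFile("Posts.xml"))
      .take(20)
      .foreach { case (word, n) => println(s"$word\t$n") }

    spark.stop()
  }
}
```

Keeping `parsePost` and `cleanWords` as plain functions means they can be checked without a Spark cluster, and wrapping the parse in `Try` lets one bad row be skipped rather than aborting the whole count.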