Tag: hadoop

  • Learn Apache Spark this Summer with edX

    edX has just announced a new series of Big Data courses. The series consists of 2 courses focused on Apache Spark. If you are not familiar with Spark, it is a very fast engine for large-scale data processing. It claims to perform up to 100 times faster than Hadoop. Here are the 2 courses: Introduction…

  • Strata 2015

    The annual Strata Conference in California is this week. The workshops have already started, but the conference does not begin until Thursday, February 19, 2015. At that time, a number of the keynote speeches will be live streamed for free. The keynotes are always great, so be sure to tune in.

  • Strata/Hadoop World 2014 Live Stream Starting Soon

    Strata + Hadoop World 2014 is going on in New York City this week. Some of the keynotes will be live streamed this morning, starting at 8:45 Eastern Time. Keynotes will also be live streamed tomorrow (Oct. 17, 2014). The keynotes are always great, and the line-up this…

  • What is a “Data Lake”?

    I have frequently been hearing the term data lake. Being the curious person that I am, I decided to go in search of a definition. Currently, the company Pivotal is responsible for marketing the term. However, I believe the term was originally coined by Dan Woods of CITO Research back in 2011. Anyhow, here is…

  • International School of Engineering Programs Beginning Soon

    I recently received the following information. International School of Engineering is announcing their 3rd batch of live e-Learning certificate programs starting 4-Sep-2013 in “Engineering Big Data with R and Hadoop Ecosystem” and “Essentials of Applied Predictive Analytics” (http://goo.gl/kHckP). These programs helped Engineers and Managers transform into Hadoop Developers/Data Scientists, get industry certifications, revolutionize their workspace…

  • Confused about Hadoop? This link will help.

    Are you confused about what Hadoop is? What about HBase, Pig, or Hive? Well, this link will help you out. Hadoop Toolbox: When to Use What | SmartData Collective. It provides a nice, short explanation of the following terms: Hadoop, HBase, Hive, Pig, Sqoop, Oozie, Flume/Chukwa, and Avro.

  • Hadoop World/Strata NYC 2012 Videos

    The videos for the Hadoop World/Strata 2012 Conference in New York City are posted on YouTube. Enjoy the videos. I may be posting some of my favorites in the coming days.

  • Hadoop World/Strata Conference

    The 2012 edition of Hadoop World and Strata Conference is underway. The conference is in New York City and if you are not lucky enough to attend, then at least you can watch the live video feed.

  • Big Data E-Learning Programs

    The International School of Engineering is launching 2 new online programs. Both are certificate programs and last 7 or 8 weeks. Both programs just started this week. Engineering Big Data with R and Hadoop Ecosystem: This program will cover Hadoop, R, MapReduce, and NoSQL. Essential Predictive Analytic Techniques: What Every Aspiring Data Scientist Must…

  • Twitter, NoSQL and Data Analysis

    This is a lengthy but very good slide deck on the what and why of the tools used at Twitter. Note: The slide deck is about 2 years old. NoSQL at Twitter (NoSQL EU 2010), by Kevin Weil.

  • A Data Science Curriculum

    This is not intended to be mapped to a set of college courses. It is intended to be a listing of necessary skills for a data scientist. For a definition of data scientist, see this previous post. Mathematics: Calculus – not directly important to data science, but the knowledge is important to understand the statistics…

  • Python For Big Data

    Travis Oliphant, the CEO of Continuum Analytics, gives a nice presentation on Python and big data. He argues that Python is frequently used in big data, but it does not get a lot of attention from the big data community. The big data community only wants to talk about Hadoop. Travis would like to see Python have a larger…