The days of Hadoop are numbered?

An interesting take on the latest ‘Big Data’ technology.

Using network visualisation to understand the best football teams

Chicks Rule? – Gender Balance on Social Networking sites

From ‘Information is Beautiful’

The Next Generation Search, from Google

Introducing the Knowledge Graph: things, not strings

Here is the video if you don’t want to read the blog:


Big Data & Provenance: Operations, machine learning and premature babies

An eye-opening discussion on finding that the ‘interesting pattern’ can be ‘no pattern’. It also touched on ‘big data’ and something I would call ‘provenance’.

Big Data’s Big Problem: Little Talent (The Wall Street Journal)

Another good piece on the importance of ‘big data’ and the urgent need for people who know how to work with it (spotted by William). It has a nice video, too.

Designing great data products

http://oreil.ly/HhD4Vk

Another piece showing why visual analytics is important to ‘big data’

http://oreil.ly/H79DlD

Steve O’Grady (@sogrady), a developer-focused analyst from RedMonk, views large-scale data collection and aggregation as a problem that has largely been solved. The tools and techniques required for the Googles and Facebooks of the world to handle what he calls “datasets of extraordinary sizes” have matured. In O’Grady’s analysis, what hasn’t matured are methods for teasing meaning out of this data that are accessible to “ordinary users.”

  • O’Grady on the challenge of big data: “Kevin Weil (@kevinweil) from Twitter put it pretty well, saying that it’s hard to ask the right question. One of the implications of that statement is that even if we had perfect access to perfect data, it’s very difficult to determine what you would want to ask, how you would want to ask it. More importantly, once you get that answer, what are the questions that derive from that?”
  • O’Grady on the scarcity of data scientists: “The difficulty for basically every business on the planet is that there just aren’t many of these people. This is, at present anyhow, a relatively rare skill set and therefore one that the market tends to place a pretty hefty premium on.”
  • O’Grady on the reasons for using NoSQL: “If you are going down the NoSQL route for the sake of going down the NoSQL route, that’s the wrong way to do things. You’re likely to end up with a solution that may not even improve things. It may actively harm your production process moving forward because you didn’t implement it for the right reasons in the first place.”

Social Media Visual Analytics Job at Yahoo! Research Spain

Seems to be exactly what we are doing here:

The original ad on infovis-wiki.net

Yahoo! Labs in Barcelona (Spain) has several openings for postdoc positions in data mining, information retrieval, HCI, and visualization with emphasis in the areas of social-network analysis and topic modeling, within the context of EU-funded projects.

Yahoo! Labs is pioneering the new sciences underlying the web. As the center of scientific excellence for Yahoo!, Yahoo! Labs delivers both fundamental and applied scientific leadership through published research and new technologies powering the company’s products.

Selected candidates will be working together with our scientists to develop novel algorithms for processing and mining extremely large amounts of data coming from user action logs and social networks. Successful candidates should have demonstrated their ability to perform high-quality original research in the areas of data mining, social-network and social-media analysis, information-propagation analysis, social-influence analysis, and/or topic modeling. Candidates should also be proficient in developing proof-of-concept prototypes and experimenting with novel algorithms in a variety of platforms for processing very large data sets.

Required skills/qualifications:

  • Ph.D. degree in Computer Science;
  • Proficiency in both written and spoken English;
  • Strong problem-solving and analytical skills;
  • Proven ability to perform high-quality original research in any of the areas above (data mining, information retrieval, HCI, visualization) both independently and as a member of a team;
  • Experience in research and/or development of applications in the areas of social networks and/or social media and/or topic modeling;
  • Proven ability to develop proof-of-concept prototypes for experimenting with novel algorithms;
  • Experience in distributed processing of large data sets (Hadoop/Pig a plus);
  • Proficiency in at least one of Python, Perl, C/C++, Java;
  • Experience with UNIX systems.

An interesting paper on “Interactive Dynamics for Visual Analysis”

Published in ‘Communications of the ACM’ by two big names in visualisation, one young and one senior.

Heer & Shneiderman (2012), Interactive Dynamics for Visual Analysis

Spotted by William.