Beyond the Data Deluge
Although this document covers a subject area that I do not normally follow, reading it gave me an understanding of what is being discussed.
As research progresses over time, so does the amount of data it produces. As Hey and Trefethen (2003) put it, “Today, some areas of science are facing hundred- to thousandfold increases in data volumes from satellites, telescopes, high-throughput instruments, sensor networks, accelerators, and supercomputers, compared to the volumes generated only a decade ago.”
Even though a lot of storage is available, the challenge is finding a good place to store this information so that researchers can access and filter it with ease.
Why not use a clustered database system? This is where we run into a problem. Given the amount of disk space new research takes up, the limitation is no longer space but read-write speeds (access times).
Improvements have been made in newer database systems. For example, “GrayWulf won the Storage Challenge at the SC08 conference by executing a query on the Sloan Digital Sky Survey (SDSS) database in 12 minutes; the same task took 13 days on a traditional (nonparallel) database system” (Szalay, 2009).
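To put the quoted figures in perspective, the speedup can be worked out directly. The short Python sketch below does that, and also estimates how long a full scan takes when read throughput is the bottleneck. The dataset size, per-node throughput, and node count are illustrative assumptions of mine and do not come from the article.

```python
MINUTES_PER_DAY = 24 * 60


def speedup(baseline_minutes: float, parallel_minutes: float) -> float:
    """How many times faster the parallel run is than the baseline."""
    return baseline_minutes / parallel_minutes


def scan_hours(dataset_tb: float, mb_per_sec_per_node: float, nodes: int = 1) -> float:
    """Hours needed to read a whole dataset once, assuming the data is split
    evenly across nodes and every node reads at the same steady rate."""
    total_mb = dataset_tb * 1_000_000  # 1 TB taken as 1,000,000 MB
    seconds = total_mb / (mb_per_sec_per_node * nodes)
    return seconds / 3600


# Figures quoted in the text: 13 days versus 12 minutes.
ratio = speedup(13 * MINUTES_PER_DAY, 12)
print(f"Quoted speedup: {ratio:.0f}x")

# Hypothetical numbers: a 10 TB dataset read at 100 MB/s per node.
print(f"1 node:   {scan_hours(10, 100, nodes=1):.1f} hours")
print(f"50 nodes: {scan_hours(10, 100, nodes=50):.2f} hours")
```

Under these assumptions, storing the data is not the hard part. Reading it all back is, which is why spreading the reads across many machines makes such a difference.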
This article was published by the American Association for the Advancement of Science (AAAS) on 6 March 2009.
The American Association for the Advancement of Science is well known for its articles and journals and holds a respected name in the scientific community. The article also references the information it presents, so I think it sits on the reliable side. It would help if I were more familiar with the subject matter and the references that back it up, but from my limited knowledge, the source seems credible.
AAAS has a site dedicated to their articles and journals. https://www.aaas.org/page/featured-articles
In Google Scholar, a search for Data Deluge returns approximately 80 thousand results, with the top result cited 1363 times and this article cited 483 times.
Analytics 3.0
This article is about the evolution of data and analytics and how they have changed over time. It describes two main eras of data: BBD (before big data) and ABD (after big data).
It also discusses how analytics has gone through three main stages: Analytics 1.0, 2.0 and 3.0. Put briefly, analytics has moved from finding out where someone is from to finding out who they are, what they like, and what to suggest to them. To keep their customers coming back, websites use search algorithms, recommendations from peer groups, product suggestions, and ads targeted at specific audiences, all driven by analytics based on huge amounts of data (Davenport, 2013).
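As a rough illustration of what a recommendation from a peer group could look like, the Python sketch below suggests items bought by customers whose purchases overlap with the target customer's. It is a toy example of my own, not the method described by Davenport, and the customer names and purchase data are made up.

```python
from collections import Counter


def recommend(target: str, purchases: dict[str, set[str]], top_n: int = 2) -> list[str]:
    """Suggest items the target does not own, weighted by how many
    purchases each other customer has in common with the target."""
    owned = purchases[target]
    scores: Counter[str] = Counter()
    for user, items in purchases.items():
        if user == target:
            continue
        overlap = len(owned & items)  # shared purchases act as similarity
        if overlap == 0:
            continue  # ignore customers with nothing in common
        for item in items - owned:
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical purchase histories.
purchases = {
    "ana": {"laptop", "mouse"},
    "ben": {"laptop", "mouse", "keyboard"},
    "cho": {"laptop", "monitor"},
    "dev": {"novel"},
}

print(recommend("ana", purchases))  # ['keyboard', 'monitor']
```

Real systems work from far more data and more careful similarity measures, but the principle of using the behaviour of similar customers to drive suggestions is the same.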
Harvard Business Review – December 2013 Issue. This article can also be found online.
Harvard Business Review is a management magazine published by Harvard Business Publishing, a subsidiary of Harvard University, and is available online and in print. Its articles are editorially reviewed rather than peer reviewed in the academic sense, but it is still a credible source for a wide range of information on management, business, and various other industries. Given its association with Harvard, I would consider the information in it to be compiled by some of the best in the field.
Thomas Davenport is well established in American academia and has published over 100 pieces of literature, including books and articles. I would consider him a very credible source on the subject of analytics, as he is the Director of Research at the International Institute for Analytics, amongst other roles in analytics and information management.
Thomas Davenport has written, coauthored, or edited sixteen books, as well as contributing to over 100 articles published by the Harvard Business Review, The Financial Times, and many other publications. His publications can be found online by searching for Thomas H. Davenport. He specialises in articles and literature on analytics, business process information, and knowledge management.
Harvard Business Review has its own website, which includes an archive of previous magazine issues, at https://hbr.org/
Thomas Davenport has also co-authored an article in the Harvard Business Review called Data Scientist: The Sexiest Job of the 21st Century, which can be found at https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century
Searching for big data and analytics in Google Scholar returns about 2 million results. This article has 167 citations on Google Scholar.
Should Microsoft be your AI and machine learning platform?
This blog post is about Microsoft Azure and its PaaS (Platform as a Service) offering. It talks about how the Cortana Intelligence Suite offers an easy-to-use service for anyone interested in AI. It compares Azure to AWS, describing AWS as “more DIY” than Azure: AWS requires you to build more of the foundation yourself, whereas Azure gives you the foundation (Heath, 2016). The author also describes how Rolls-Royce uses Azure AI to assist in its business, for example in engine maintenance and service (Heath, 2016).
This information was sourced from a blog article on ZDNet, written by Nick Heath, which can be found at http://www.zdnet.com/article/should-microsoft-be-your-ai-and-machine-learning-platform/
I would consider this article to be a biased opinion on which cloud vendor offers the best AI suite. Opinion pieces are useful, but one person holding an opinion does not make it correct. This blog post is not properly referenced, and it relies on a very specific example, which means the information cannot be generalised to other cloud vendors.
The author is chief reporter for TechRepublic, an online news source, and writes mainly about technology for decision makers in the IT industry. He has written many other tech-related articles on subjects such as the Raspberry Pi, operating systems, and smartphones. He is also a senior reporter for ZDNet, writing about similar topics, so I would assume he has a reasonably credible knowledge base, although he seems biased towards certain topics.
Nick Heath has written a number of articles for TechRepublic and ZDNet, including an article covering everything you need to know about AI, which can be found on ZDNet at http://www.zdnet.com/article/what-is-ai-everything-you-need-to-know-about-artificial-intelligence/
This blog post does not appear in Google Scholar, which is understandable as it is a tech blog and not a peer-reviewed journal. Its absence from Google Scholar does not help its credibility.
Hey, A. and Trefethen, A. (2003). In: F. Berman, G. Fox and T. Hey, eds., Grid Computing: Making the Global Infrastructure a Reality. Chichester, UK: Wiley, pp. 809-824.
Szalay, A. (2009). GrayWulf: Scalable Clustered Architecture for Data-Intensive Computing. In: 42nd Hawaii International Conference on System Sciences. [online] Hawaii, Paper 720. Available at: http://research.microsoft.com/apps/pubs/default.aspx?id=79429 [Accessed 30 Mar. 2018].
Davenport, T. (2013). Analytics 3.0. Harvard Business Review. [online] Available at: https://hbr.org/2013/12/analytics-30 [Accessed 30 Mar. 2018].
Heath, N. (2016). Should Microsoft be your AI and machine learning platform?. [Blog] ZDNet. Available at: http://www.zdnet.com/article/should-microsoft-be-your-ai-and-machine-learning-platform/ [Accessed 30 Mar. 2018].