A few years back, the Harvard Business Review described the commercial role of a data scientist as “the sexiest job of the 21st century”, instantly turning the phrase into a buzzword. What is more, according to The Economist, as of 2017 “the world’s most valuable resource is no longer oil, but data”.

Data science is an interdisciplinary field which deals with many aspects of data in order to gain some measurable insight. The purpose of a data science exercise is, typically, to build models which either explain some phenomenon (by offering data-based evidence) or perform prediction and recognition (mostly using past data to foretell some future activity). Data is the main ingredient fed into computer models. Take the Controlled Vehicular Access system in place in Valletta; this uses software which is able to recognise your car’s number plate via cameras and charge you for access and parking in our capital city. This software ‘learns’ by being shown many images of characters, covering all the letters and numbers, possibly from various angles and under a range of weather conditions.

A data science project starts with data collection, carried out either manually through surveys or automatically via sensors. After collection, the data is typically cleaned and prepared for use in the analytical part of the data science process. Once cleaned, it is stored in appropriate formats for later use.

Sometimes there is so much data that it needs to be stored across a network of computers, and mechanisms to speed up access to it must be implemented. Data is then analysed using statistical techniques. Visualisations are important here: a picture really is worth a thousand words. Depending on the end goal of the data science project, data may then be used to build computational models which fulfil some business, engineering, or scientific goal.
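To make the steps above more concrete, here is a minimal sketch in Python of a toy pipeline: collect, clean, store, analyse and model. Everything in it is an assumption made for illustration; the sensor readings are invented, the function names (clean, store, load, fit_line) are hypothetical, and a real project would use far larger datasets and more robust tools.

```python
import csv
import io
import statistics

# Hypothetical raw readings (hour, temperature in degrees Celsius) from an
# imagined sensor. None marks a failed reading; 999.0 is a faulty value.
RAW_READINGS = [(0, 15.0), (1, 16.0), (2, None), (3, 18.0), (4, 999.0), (5, 20.0)]


def clean(readings, low=-50.0, high=60.0):
    """Drop missing readings and values outside a plausible range."""
    return [(h, t) for h, t in readings if t is not None and low <= t <= high]


def store(readings):
    """Write the cleaned readings as CSV text (standing in for real storage)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["hour", "temperature"])
    writer.writerows(readings)
    return buffer.getvalue()


def load(csv_text):
    """Read the stored CSV text back into (hour, temperature) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(int(row["hour"]), float(row["temperature"])) for row in reader]


def fit_line(readings):
    """Least-squares straight line: temperature = slope * hour + intercept."""
    hours = [h for h, _ in readings]
    temps = [t for _, t in readings]
    mean_h = statistics.mean(hours)
    mean_t = statistics.mean(temps)
    covariance = sum((h - mean_h) * (t - mean_t) for h, t in readings)
    variance = sum((h - mean_h) ** 2 for h in hours)
    slope = covariance / variance
    return slope, mean_t - slope * mean_h


if __name__ == "__main__":
    data = load(store(clean(RAW_READINGS)))
    slope, intercept = fit_line(data)
    print(f"mean temperature: {statistics.mean(t for _, t in data):.2f}")
    print(f"model: temperature = {slope:.2f} * hour + {intercept:.2f}")
```

The fitted line plays the role of the "computational model" here: once the slope and intercept are known, they can be used to estimate the temperature at an hour for which no reading exists.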

Data science is not a new field, but rather an evolution of traditional data-related fields (e.g. analytics or business intelligence). It has gained momentum in recent years and differs from other, more classical data fields for two main reasons.

First, there is the staggering volume of data being generated (often referred to as Big Data). One mind-boggling piece of trivia is that 90 per cent of all the data in the world was generated in the last two years. Impressively, this has held true for the past 30 years. Data science allows us to build new tools and algorithms to process, store and analyse this data. Second, given the volume of data, it is not possible to investigate it manually, and we require a certain degree of automation, where models and visualisations are built automatically based on the latest data. This requires solid software engineering skills.
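It is worth spelling out what the 90 per cent figure implies. The short Python sketch below takes the claim at face value (an assumption, not a measurement): if 90 per cent of all data ever produced comes from the last two years, then the world's total must be about ten times what it was two years earlier. The function name growth_factor is made up for this illustration.

```python
def growth_factor(recent_share: float) -> float:
    """Factor by which the running total grows over one period, assuming
    `recent_share` of everything ever produced came from that period."""
    if not 0.0 <= recent_share < 1.0:
        raise ValueError("recent_share must be in the range [0, 1)")
    return 1.0 / (1.0 - recent_share)


# 90 per cent generated in the last two years: roughly a tenfold increase.
two_year_factor = growth_factor(0.9)

# Equivalent yearly growth: the square root of the two-year factor.
yearly_factor = two_year_factor ** 0.5

# If the pattern held for 30 years, that is 15 two-year periods in a row.
thirty_year_factor = two_year_factor ** 15

if __name__ == "__main__":
    print(f"growth every two years: about {two_year_factor:.1f}x")
    print(f"growth every year:      about {yearly_factor:.2f}x")
    print(f"growth over 30 years:   about {thirty_year_factor:.1e}x")
```

In other words, under this assumption the volume of data roughly triples every year, which compounds to an increase of around a million billion times over 30 years.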

The data scientist’s role is a technical one, well-suited to individuals with interdisciplinary training. Data scientists make use of statistics, mathematics, computer programming, distributed computing, machine learning and human-computer interfaces.

Applications of data science can be found in all domains, including finance, health, earth observation, energy, traffic management, telecommunications, logistics, education, manufacturing, agriculture and iGaming. Each of these areas has an endless supply of problems where data science can be applied successfully.

For example, in the iGaming industry, popular applications of data science include fraud detection, predicting customer lifetime value, predicting customer attrition and game recommendation systems. Both locally and abroad, there is a growing and varied demand for trained data scientists.

The University of Malta has recently launched a Data Science Research Platform, which also acts as a collaboration vehicle between industry and academia for data science projects (get in touch on dsrp.research@um.edu.mt). Dr Jean-Paul Ebejer is a member of the Data Science Research Platform and a lecturer at the University of Malta. The Faculty of ICT offers a number of undergraduate and postgraduate courses in related subjects.

Did you know?

• The first electronic computer, ENIAC, weighed more than 27 tons and took up 1,800 square feet.

• The first computer mouse was made of wood in 1964 by Doug Engelbart.

• HP, Microsoft and Apple were all started in a garage.

• The original name of the Windows operating system was Interface Manager.

• Do you know what CAPTCHA stands for? ‘Completely Automated Public Turing test to tell Computers and Humans Apart’.

• Symbolics.com was the first domain name ever registered online. It was registered on March 15, 1985.

For more trivia see: www.um.edu.mt/think

Sound bites

• More and more processes are being automated. Self-driving delivery vehicles are finding their way into many areas. However, an interdisciplinary research team has observed that cooperation between humans and machines can work much better than just human or robot teams alone.

https://www.sciencedaily.com/releases/2019/05/190524113529.htm

• Computer scientists have taught an artificial intelligence agent how to do something that usually only humans can do: take a few quick glimpses around and infer its whole environment. This skill is necessary for the development of effective search-and-rescue robots that could one day make dangerous missions more effective.

https://www.sciencedaily.com/releases/2019/05/190515144017.htm

For more soundbites listen to Radio Mocha: Mondays at 7pm on Radju Malta and Thursdays at 4pm on Radju Malta 2 https://www.fb.com/RadioMochaMalta/
