A Primer on AI/ML/DL/NN etc.

Today, many of us non-technical people feel left out of the conversations buzzing around companies, social media, webinars and presentations.

Yes – I am talking about the most talked-about acronyms – Artificial Intelligence (AI), Machine Learning (ML), Deep Learning (DL) and Neural Networks (NN) – along with related terms such as Big Data, statistical methods, Data Science and Predictive Analytics.

This is my attempt to explain the basics.

WHAT IS ARTIFICIAL INTELLIGENCE (AI)?

  1. Artificial intelligence (AI) is the simulation of human intelligence processes by machines, especially computer systems. If a system or a device can do “smart” things like humans do, then it is said to be artificially intelligent.
  2. It is an umbrella concept that includes image processing, natural language processing, robotic process automation, machine learning, neural networks and many more.
  3. There is a common misconception that AI is a system in itself; rather, it is implemented within a system. Particular applications of AI include expert systems, speech recognition (Natural Language Processing, or NLP) and machine vision.
  4. These processes include learning (the acquisition of information and rules for using the information), reasoning (using rules to reach approximate or definite conclusions) and self-correction.

WHAT IS MACHINE LEARNING (ML)?

  1. To put it very simply, machine learning is defined as “the ability (for computers) to learn without being explicitly programmed.” Machine Learning makes your computers (or machines) learn from data provided by the external environment – connections to sensors, electronic components in devices, storage devices, etc. It also crunches huge input data sets to come up with patterns and predictions – like Amazon suggesting what your buying preferences are, or Netflix offering options based on your previous viewing history.
  2. Machine Learning is simply a way of achieving Artificial Intelligence. The main objective of ML is to allow the computers to learn automatically without human intervention, assistance or programming and adjust actions accordingly.
  3. ML builds models with algorithms that it constantly updates and fine-tunes based on the inputs you provide on an ongoing basis.
  4. Machine learning enables analysis of massive quantities of data.
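The idea of “learning from data rather than being explicitly programmed” can be sketched in a few lines. Here, a minimal model fits a straight line to observed points by ordinary least squares and then predicts a new value; the spend-versus-sales numbers are made up purely for illustration:

```python
# Learning a pattern from data: fit y = slope * x + intercept to
# observed points by ordinary least squares, then predict a new value.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: advertising spend (x) vs. units sold (y).
spend = [1, 2, 3, 4, 5]
sold = [12, 19, 31, 42, 48]

slope, intercept = fit_line(spend, sold)
prediction = slope * 6 + intercept   # forecast sales at spend = 6
print(round(prediction, 1))          # prints 58.9
```

Nobody told the program that spending more sells more units; that relationship was extracted from the data itself, which is the essence of what larger ML systems do at scale.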

WHAT IS DEEP LEARNING (DL)?

  1. Deep learning is a specialized form of machine learning. For example, a typical machine learning workflow starts with relevant features being manually extracted from images. The features are then used to create a model that categorizes the objects in the image.
  2. Whereas with a deep learning approach, relevant features are automatically extracted from images. In addition, deep learning performs “end-to-end learning” – where a network is given raw data and a task to perform, such as classification, and it learns how to do this automatically.
  3. Deep Learning is also sometimes referred to as “Artificial Neural Networks”. Another key difference is that deep learning algorithms scale with data: they often continue to improve as the size of your data increases.
  4. Deep learning is applied in many areas of artificial intelligence such as speech recognition, image recognition, natural language processing, robot navigation systems, self-driving cars etc. Some examples that we see in our daily lives are virtual assistants like Alexa, Siri, Cortana, driverless trucks, drones and automated cars, automatic machine translation, Character text generation, facial recognition, behavioural analysis, etc.
  5. Big Data is required for Deep Learning: massive data sets must be fed into the models. However, the bottleneck remains in cleansing and processing the data into the format required to power the DL models.

WHAT ARE NEURAL NETWORKS?

  1. A neural network is a type of machine learning that models itself after the human brain. Neural networks cannot be programmed directly for a task. Rather, just like a child’s developing brain, they need to learn from the information they are given.
  2. They have become important and standard tools for data mining. A neural network is an adaptive system that changes its structure based on external or internal information flowing through the network during the learning phase.
  3. A neural network usually involves a large number of processors operating in parallel and arranged in tiers. The first tier receives the raw input information — analogous to optic nerves in human visual processing. Each successive tier receives the output from the tier preceding it, rather than from the raw input — in the same way neurons further from the optic nerve receive signals from those closer to it. The last tier produces the output of the system.
  4. Handwriting recognition is an example of a real-world problem that can be approached via an artificial neural network. Humans can recognize handwriting by simple intuition, but for computers each person’s handwriting is unique, with different styles and even different spacing between letters, making it difficult to recognize consistently. Handwriting recognition has applications as varied as automated address reading on letters at the postal service, authorizing signatures on documents, reducing bank fraud on checks, etc.
  5. Applications have expanded to many more areas such as chatbots, stock market prediction, delivery route planning and optimization, drug discovery and development, and many more.
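The “tiers” described above can be sketched as a tiny feed-forward pass: the first tier receives the raw input, and each later tier consumes only the outputs of the tier before it. The weights and inputs below are fixed, illustrative numbers; in a real network they would be learned from data:

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One tier: each neuron weighs every input, adds a bias, squashes."""
    return [sigmoid(sum(w * i for w, i in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

raw_input = [0.5, 0.8]                       # first tier: raw data
hidden = layer(raw_input,                    # middle tier: sees only the tier before it
               weights=[[0.4, -0.6], [0.7, 0.1]],
               biases=[0.0, -0.2])
output = layer(hidden,                       # last tier: the system's output
               weights=[[1.2, -0.8]],
               biases=[0.1])
print(output)
```

“Learning” in a real network means adjusting those weights and biases, over many examples, so the final tier’s output matches the desired answers.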

WHAT IS DESCRIPTIVE, PREDICTIVE AND PRESCRIPTIVE ANALYTICS?

  1. Descriptive – based on insights into historical data – What has happened?
  2. Predictive – based on statistical tools and forecasting techniques to answer – What could happen?
  3. Prescriptive – based on simulation and optimization algorithms to advise on possible outcomes and answer – What should be done?
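The three questions above can be put side by side in a toy example on a made-up monthly sales history; the demand-lift and margin figures in the prescriptive step are invented assumptions, not real business rules:

```python
sales = [100, 110, 120, 130, 140]   # hypothetical monthly history

# Descriptive: what has happened?
average = sum(sales) / len(sales)

# Predictive: what could happen? (naive forecast: extend the
# average month-over-month change one step forward)
step = (sales[-1] - sales[0]) / (len(sales) - 1)
forecast = sales[-1] + step

# Prescriptive: what should be done? (simulate each option and
# pick the discount level with the best projected profit)
def projected_profit(discount):
    units = forecast * (1 + 2 * discount)    # assumed demand lift per discount point
    margin = 0.3 - discount                  # assumed 30% base margin
    return units * margin

best_discount = max([0.0, 0.05, 0.10, 0.15], key=projected_profit)
print(average, forecast, best_discount)
```

Real analytics platforms use far richer statistical models and optimizers, but the progression is the same: summarize the past, project the future, then choose among actions.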

WHAT IS DATA SCIENCE AND WHAT CAN YOU DO WITH IT?

  1. Data Science is the study of the identification, representation and extraction of meaningful information from data sources.
  2. Some of the tasks you can do with Data Science include: framing conclusive research and open-ended questions; extracting large volumes of data from external and internal sources; deploying statistical, machine learning and analytical methods; cleaning, pruning and preparing data for processing and analysis; and looking at data from various angles to determine hidden patterns, relations and trends.
  3. If you are wondering about the difference between a Data Analyst and a Data Scientist, the two are set apart mainly by the goal or objective with which they work. A Data Analyst starts by aggregating, querying and mining data to report on various functions. A Data Scientist starts by asking the right questions, and therefore needs substantive domain expertise as well as non-technical skills.

ANALYTICS FOR FRAUD INVESTIGATIONS

Many have wondered why one would perform analytics for fraud detection (or prevention) in good times (business as usual), when no whistle has been blown about a suspected fraud.

Is this not a grey area where people’s sensitivities are involved, and where news of an investigation can affect the organization’s brand image or invite social media trolling that becomes painful to counter? Yet the CFO’s office is the hardest hit when it comes to answering the Board about the financial losses incurred through fraudulent activities that leave a gaping hole in the finances.

Traditional anomaly detection is conducted routinely by internal or external auditors. But these audits are often insufficient: they are not backed by powerful tools, and their objectives and terms of reference limit the investigation to a certain level and no more.

Often referred to as “Forensic Audit”, fraud detection assumes great significance because it requires digging deeper than a normal audit: examining and investigating internal control failures, conflicts of interest and social networks, applying factors such as behavioural analysis, and crunching big data that can extend beyond the time period under the lens.

A prudent and practical approach would be to set up a mechanism that can proactively provide analytics and flag off high risk areas that need immediate attention.

Fraud Analytics is the use of analytical technology with intelligent business rules and techniques to help detect improper transactions – bribery, favouritism, working capital leakage, asset misappropriation, etc. – either before or after the transaction occurs, so that appropriate steps can be taken to prevent further damage.

Fraud Analytics also helps measure performance, evaluate internal control failures and deficiencies, standardize processes and drive constant improvement, benefiting the overall organization and its governance.

Fraud perpetrators use many different and unique techniques, randomized to avoid discovery, and therefore the techniques used for detection have to include one or more of the following:

  1. Capable of running automated business rules that throw up anomalies that can be further investigated for false / true positives.
  2. Calculation of various statistical parameters like averages (for example average number of calls made, emails exchanged, delays in bill payments, etc.), quantities (for example comparison of total quantities ordered / received / invoiced / returned), performance metrics (e.g. attrition rate pattern amongst certain departments, sales returns peaking immediately after monthly close, etc.), user profiles (e.g., interested party contracts, sudden lifestyle changes by the user, behavioural patterns noticed) etc.
  3. Trend analysis using time series distribution.
  4. Clustering and classification that can help find patterns and associations within data sets.
  5. Algorithms, models and probability distributions of various business activities.
  6. Machine learning and neural networks that automatically identify the characteristics of fraud and can then be applied to ever-growing Big Data inputs.
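The first two techniques above can be sketched as a single automated business rule: flag any payment whose amount is far from the vendor’s average. The payment amounts and the two-standard-deviation threshold are illustrative choices, not a recommended production rule:

```python
import statistics

# Hypothetical payments to one vendor; 5400 is the planted anomaly.
payments = [1020, 980, 1010, 995, 1005, 990, 5400, 1000]

mean = statistics.mean(payments)
stdev = statistics.stdev(payments)

# Simple rule: flag anything more than 2 standard deviations from average.
flagged = [amount for amount in payments
           if abs(amount - mean) > 2 * stdev]
print(flagged)   # prints [5400]
```

Anything flagged is then investigated by a human to separate true positives from false ones; in practice the rules combine many signals (quantities, metrics, user profiles) rather than a single average.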

Having a Fraud Prevention program for controlling fraud risks is an important part of Enterprise Risk Management, and it gives your investors, partners and auditors more confidence in your demonstrated ability to tackle fraud in a sustained manner rather than on an ad-hoc basis.