data

2023 Aug 17

Notes on machine learning, part 1: What is it

This is the first part of a series (that’s the intention at least 🤣) that tries to give a broad overview of what I do in my academic field. It’s machine learning - ML¹ - and in this part I’ll focus on what it is; I’ll talk about its applications, research and development in later parts. Keep in mind also that this is the first time I’ve written extensively about what I do in general terms and in a popular format! I’ll explain mainly through examples and avoid formal definitions as much as possible. I’ll try my best 🙃 and please send feedback, it’ll help!!

ML can help in many fields but the problem to solve often boils down to a prediction. Some pretty famous examples of usage are:

how will the rates of infection from Covid develop over specific locations in the following weeks and months?
which advertisements to put in front of a specific user’s eyes to maximize the chance that they will stop scrolling, click and buy?
how does my iPhone correctly identify its owner’s face and unlock?
can we estimate future rates of reoffending for people convicted of crimes?
how to evaluate workers’ performance in any given field?²

There are also less well-known, yet no less interesting, applications, such as:

how will a community’s population evolve in the coming years, with regard to phenomena like gentrification and segregation?³
how to help doctors in making more informed diagnostic and treatment decisions?
how to ensure a prediction is made for ethical reasons?
how do we isolate, recognize and measure unfairness?

Classical predictive models

There are several ways to try to predict the future. A few common ways are:

based on human-set rules,
based on models of reality,
based on statistical knowledge of historical behavior.

Let’s look at each of them because ML draws concepts from each.

Human-set rules

This is what I refer to as the “classical” way of controlling and predicting behavior in artificial systems. An example would be: if the industrial machine for baking cookies reaches 90 °C on the outside surface, shut it off automatically before it burns. This rule (90 °C external sensor → shut down) is set in this example by a human expert. They know that:

90 °C is above the normal temperatures during usage, and
it’s far enough above the normal to be dangerous to the materials used or to the factory in which the machine is installed or to the humans operating it.
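
A rule like this can be written down directly as code. Here is a minimal sketch in Python; the function name and the constant name are made up for illustration, and only the 90-degree threshold comes from the example above.

```python
# Threshold chosen by the human expert in the cookie-machine example.
SHUTDOWN_THRESHOLD_C = 90.0


def should_shut_down(surface_temp_c: float) -> bool:
    """Return True when the external sensor reading reaches the threshold."""
    return surface_temp_c >= SHUTDOWN_THRESHOLD_C


# A few hypothetical sensor readings.
for reading in (72.5, 89.9, 90.0, 104.3):
    action = "shut down" if should_shut_down(reading) else "keep running"
    print(f"{reading:.1f} °C -> {action}")
```

Nothing here is learned from data: the number 90 sits in the code only because a person decided it should.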

There are fields in which this approach is the gold standard, and probably should be so for a long time (e.g. nuclear power plant control systems). Still, this approach represents some of the roots of ML.

Models of reality

Many engineering designs are first represented in a computer model through programming and design languages. For instance, a highway bridge will be represented in a computer before being built. The same will happen during the design of a space rocket⁴. The software employed allows for artificial perturbation of conditions, like introducing strong winds or earthquakes to check and see what would happen to the bridge as designed.

Statistics and historical behavior

There are several assumptions that we make when trying to infer conclusions from historical behavior using statistics. I won’t talk about all of them here, just one: the concept of a “data generating machine”. The idea is that phenomena are directed by invisible, highly complex mathematical functions that, for a set of values of variables (the input of the function), give the outcome of the phenomenon (the output of the function).

A (made-up) example

Variables
number of trains passing on the same tracks today
number of passengers for each of those trains
detailed weather characteristics
experience of train staff including conductors

Generates
number of minutes a train will be late at each station

Once again, this example is 100% made up. All I’m trying to picture is the hypothesis, made in statistics, of the existence of a specific relationship between variables (the characteristics of the causes of the phenomenon) and the outcome of the phenomenon itself (in this case, how many minutes the train will be late). The idea of a data generating machine is that there exists a mathematical function that formalizes this relationship and assigns a unique value of the outcome to each set of inputs.
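
To make the idea concrete, here is a toy “data generating machine” for the train example, sketched in Python. Everything in it is an assumption made for illustration: the function names, the choice to reduce “weather” to millimetres of rain, and every coefficient. In reality the function is unknown to us, and far more complex.

```python
import random


def train_delay_minutes(trains_on_track, passengers, rain_mm, staff_years):
    """Hypothetical data generating machine: inputs in, delay out.

    Every coefficient here is invented purely for illustration.
    """
    delay = (
        0.5 * trains_on_track    # more traffic on the tracks, more delay
        + 0.01 * passengers      # boarding takes longer
        + 0.8 * rain_mm          # bad weather slows things down
        - 0.3 * staff_years      # experienced staff recover time
    )
    return max(0.0, delay)  # in this toy, a train is never "early"


def observe(n, seed=0):
    """Sample n made-up (inputs, outcome) records, like historical data."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        inputs = (
            rng.randint(5, 40),      # trains on the same tracks today
            rng.randint(50, 600),    # passengers on this train
            rng.uniform(0.0, 20.0),  # rain in mm, a stand-in for weather
            rng.uniform(0.0, 30.0),  # staff experience in years
        )
        records.append((inputs, train_delay_minutes(*inputs)))
    return records


for inputs, delay in observe(3):
    print(inputs, "->", round(delay, 1), "minutes late")
```

The same inputs always give the same delay, which is the “unique value” part of the hypothesis. In practice we only ever see records like the ones `observe` returns, never the function itself.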

Much of the predictive statistical modeling work is to try and figure out this function⁵ as accurately as possible. There are many possible techniques. We’re getting closer to machine learning.

Inference of an approximate function

A regression is one of these statistical predictive techniques and is used to extract this function from a set of data. The data consists of sets of variable values and outcomes. We call the variables “features” and the outcome “class” or “dependent/target variable”⁶.

Another example

Let’s suppose that the phenomenon in question is the causal connection between the number of years of formal education that a person had and their current monthly income. We’re hypothesizing that the first determines the second. These might be the data we have (made up, but realistic⁷):

The process of applying regression might extract this function, in pink:

This function does not provide, for each value of the feature, values of the phenomenon that exactly mirror the input data. Instead, for each value of X it provides the value lying on the segment drawn. It’s just the best function that can be inferred from the data given the algorithm chosen by the operator; in this case, the algorithm is a “linear regression with 1 regressor and the OLS optimization function”.
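
For the curious, this kind of fit can be sketched in a few lines of Python. The numbers below are invented for this sketch (they are not the data behind the plots in this post), and `fit_ols` is a hypothetical helper name. It uses the standard closed-form solution for ordinary least squares with one regressor.

```python
def fit_ols(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = (sum of co-deviations of x and y) / (sum of squared deviations of x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data: years of formal education and monthly income.
years = [8, 10, 12, 13, 16, 18, 21]
income = [1100, 1500, 1400, 1900, 2300, 2200, 3000]

intercept, slope = fit_ols(years, income)
print(f"income ~ {intercept:.0f} + {slope:.0f} * years")

# The fitted line does not pass through the data points exactly.
for x, y in zip(years, income):
    predicted = intercept + slope * x
    print(f"{x:>2} years: observed {y}, on the line {predicted:.0f}")
```

The loop at the end compares each observed value with the value on the line. The line is only the best straight-line summary of the data, not a copy of it.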

A deep dive on linear regression is available in a really well-made 30-minute video by a fantastic YouTube channel, StatQuest.

Machine learning predictive models

All three of these concepts play a role in ML. Applying ML to a predictive problem means all of the following:

Modeling reality, that is creating a model of reality - although the model is automatically inferred instead of being intelligently designed by the operator;
A model that is (often) based on rules - although:
- the rules might be so complex they don’t make sense to the human, or
- the rules might look like different types of rules than what you would expect⁸, or, finally,
- the rules might not look like rules at all.
A model whose behavior and/or rules are (often) inferred from statistical, historical data⁹.

The purpose of creating this model is to try and predict the behavior of our phenomenon of interest based on the data (circumstances and outcomes, a.k.a. features and target variables) that same phenomenon generated in the past.

Next articles…

I’m thinking some interesting points to touch on in the next parts are:

What is the role of a human operator in ML?
What are objective functions?
What are some of the most crucial issues with this?
What are interesting research directions right now?
How does ML influence society?

Do let me know some points you’d like to read about, as well as any questions!! I’m particularly interested in how accessible this text was for non-experts. Looking forward to your feedback!

Why I don’t call what I do “artificial intelligence”

While I use machine learning, ML, or sometimes applied or computational statistics to describe¹ what I do, the term artificial intelligence, AI, is now very widespread in society, industry, politics and marketing. The same is true in Italian, where the shorthand is very similar: IA. However, I try to refrain from using it as much as possible. Why? For reasons of coherence, and out of cultural and societal concerns.

AI is overused and already characterized in literature and cinema.

While we like to think that everyone now knows what we do, we’re nowhere near that place. There are still millions of people who, when hearing artificial intelligence, think of the machines in The Matrix. They think of consciousness, of AI-human wars, even of AI being alive. They know that AI is either malevolent and something to stop or a benevolent force that will consciously help humanity. Does it sound silly? A Google engineer some time ago thought that their chat model was alive, and recently a letter by a few AI practitioners foresaw a danger of a war against AI.

AI is already very characterized in pop culture and what we’re doing has nothing to do with that.

I don’t think what we create qualifies as intelligence.

The question of “what is intelligence” is philosophical in nature and possibly will never have one uniquely correct answer. For me, the main component of intelligence is creativity. When we see something that’s scalding hot, and we want to move it, we might poke it with a stick, protect our hand with a thick glove, kick it quickly so that our skin is not burned, or more. We might even come up with something that nobody ever did, ever. We all find it strange to think that something so simple might have a historically unique answer, but in the end, everything that exists was first made by someone, or a team, for the first time ever.

A computer isn’t able to create. If nobody ever did something, a computer won’t invent it.²

And I’ll hazard a prediction: a computer will never be able to actually be creative. Of course, what it means to be creative is also a philosophical question. Painting something beautiful used to be used as an example of creativity, but a computer can emulate it by using known painting and image patterns, rearranging them randomly or according to a distribution, joining different known techniques and patterns in a new way, and it ends up that not every painting is actually an act of creativity. Who would have guessed?

Of course there are more components to intelligence. Memory, the ability to learn from knowledge and experience, the ability to make calculations. And while the computer obviously has memory and calculation power, the ability of a computer to learn is, I would reckon, nowhere near the way that a human learns. A computer learns, when we get down to the nitty-gritty, to optimize a mathematical function with math and statistics. What is that? Who chooses the function? Who chooses the metrics? Who chooses the datasets and the algorithms? Humans do, because that requires real reasoning, real intelligence.

And while there are some parallels between machine learning and the sociology of human growth and human acquired behavior - where it is encouraged or discouraged by the social groups we find ourselves in - I still think there’s more than enough difference to see the two processes of learning as deeply different.

Much of the human learning process has to do with rewards, the human necessity of belonging, of social recognition, with the very human emotion of fear, of love, with the existence of death.

Can rewards and punishments be emulated well by mathematical functions? I don’t know. Maybe? Possibly not, possibly never? But not today, for sure.

It makes it sound like AI is not the work of humans, or that the results of AI are not the work of humans.

This is crucial. An AI denied your mortgage application? No, a human did that. Most likely a team. We as machine learning developers and data scientists need to own the results of our work. Especially its deficiencies, especially its biases, its idiosyncrasies, its reinforcements of historical unfairness. Just as well as its successes.

When we train our models on historical data without accounting for the fact that historical data paints a picture of an unfair world, our ML models will replicate that unfairness. Experts know this all too well; in fact, it’s taught in data science courses. Computers don’t have ethics, they don’t see the bias themselves. They don’t know what discrimination is, and even if we taught them that (again, with mathematical functions³), it is only humans that can tell a computer that discrimination is bad. Is adherence to the optimization of mathematical functions the same as, or will it ever be the same as, a fair mind, empathy, the experience of pain and the hope for a better future? Maybe. Maybe not, maybe never. But for sure it is only humans that can tell an algorithm what to optimize.

Who creates the content?

The only reason ChatGPT is able to write your college essay is that it has read billions of college essays. So if the only way that AI can produce results is based on humans’ work, is AI really anything at all without the human experience? I would argue not.

In fact, it is the developers and financiers of ChatGPT that are writing your college essay. And all the millions of people that authored those billions of pieces of original work.

My conclusions

What I think should really be at the forefront of social discussion is the impact and consequences of AI. The European Union - following the GDPR work in privacy⁴ - is doing massive work in AI regulation which looks to be a good step forward, but this discussion cannot be left to experts only. We need to decide as a society how and in what direction to employ our collective efforts. And the place of experts is to educate, yes, but most importantly to own our work, its results and its impact.

We are data scientists developing machine learning algorithms. We are the artificial intelligence. And - we are not so artificial ourselves, and our computers are not so very intelligent, at all.

¹ Not much of a description, yes. As I am proofreading this, I’ve realized my next post should probably be “How would I describe what I do?”.
² An exception: if we asked a computer to list 1 million things that could work for moving something scalding, and then looked at those million ideas, there may be something new among them. That would not be because of intelligence, but because there were 1 million minus one silly ideas.
³ Today, it’s not even clear how we would model discrimination and make it part of our loss functions. I read some really cool ideas though.
⁴ Work which, while good, is not perfect at all. Already it looks like the protection against unsolicited marketing communications is being hollowed out by a legitimate interest interpretation that almost completely empties all the GDPR’s protections against personal data usage for commercial reasons.

2022 Aug 11

Facebook’s latest (production!) version of their AI chatbot spews fake news about American politics, @gruber reports. Many Facebook users will think that FB’s chatbot might have particular authority as it belongs to the platform itself.

404: responsibility not found.

2022 May 16

We all agree privacy is great. (And possibly, a fundamental human right.) But which specific topics are you only comfortable talking about on a privacy-respecting platform/medium?

2022 Apr 29

I wanted to try something regarding data. Data science enthusiasts, data analysis professionals, evidence-informed activists and decision makers, machine learning and AI researchers, data visualization designers and developers, and more… Are you here on Micro.blog? 🔬