Sorting out a degree in artificial intelligence

Reading course descriptions and degree plans has helped me understand more about the fields of artificial intelligence and data science. I think some universities have whipped up a program in one of these hot fields of study just to put something on the books. It’s quite unfair to students if this is just a collection of existing courses and not a deliberate, well structured path to learning.

I came across this page from Northeastern University that attempts to explain the “difference” between artificial intelligence and machine learning. (I use those quotation marks because machine learning is a subset of artificial intelligence.) The university has two different master’s degree programs for artificial intelligence; neither one has “machine learning” in its name — but read on!

Illustration by chenspec at Pixabay

One of the two programs does not require a computer science undergraduate degree. It covers data science, robotics, and machine learning.

The other master’s program is for students who do have a background in computer science. It covers “robotic science and systems, natural language processing, machine learning, and special topics in artificial intelligence.”

I noticed that data science is in the program for those without a computer science background, while it’s not mentioned in the other program. This makes sense if we understand that data science and machine learning really go hand in hand nowadays. A data scientist likely will not develop any new machine learning systems, but she will almost certainly use machine learning to solve some problems. Training in statistics is necessary so that one can select the best machine learning algorithm for solving a particular problem.

Graduates of the other program, with their prior experience in computer science, should be ready to break ground with new and original AI work. They are not going to analyze data for firms and organizations. Instead, they are going to develop new systems that handle data in new ways.

The distinction between these two degree programs highlights a point that perhaps a lot of people don’t yet understand: people (like journalists who have code experience) are training models — using machine learning systems through writing code to control them — and yet they are not people who create new machine learning systems.

Separately there are developers who create new AI software systems, and engineers who create new AI hardware systems. In other words, there are many different roles in the AI field.

Finally, there are so-called AI systems sold to banks and insurance companies, and many other types of firms, for which the people using the system do not write code at all. Using them requires data to be entered, and results are generated (such as whose insurance rates will go up next year). The workers who use these systems don’t write code any more than an accountant writes code. Moreover, they can’t explain how the system works — they need only know what goes in and what comes out.

Creative Commons License
AI in Media and Society by Mindy McAdams is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Include the author’s name (Mindy McAdams) and a link to the original post in any reuse of this content.


How might we regulate AI to prevent discrimination?

Discussions about regulation of AI, and algorithms in general, often revolve around privacy and misuse of personal data. Protections against bias and unfair treatment are also part of this conversation.

In a recent article in Harvard Business Review, lawyer Andrew Burt (who might prefer to be called a “legal engineer”) wrote about using existing legal standards to guide efforts at ensuring fairness in AI–based systems. In the United States, these include the Equal Credit Opportunity Act, the Civil Rights Act, and the Fair Housing Act.

Photo by Tingey Injury Law Firm on Unsplash

Burt emphasizes the danger of unintentional discrimination, which can arise from basing the “knowledge” in the system on past data. You might think it would make sense to train an AI to do things the way your business has done things in the past — but if that means denying loans disproportionately to people of color, then you’re baking discrimination right into the system.

Burt linked to a post on the Google AI Blog that in turn links to a GitHub repo for a set of code components called ML-fairness-gym. The resource lets developers build a simulation to explore potential long-term impacts of a machine learning decision system — such as one that would decide who gets a loan and who doesn’t.

In several cases, long-term analysis via simulations showed adverse unintended consequences that arose from decisions made by ML. These are detailed in a paper by Google researchers. We can see that determining the true outcomes of use of AI systems is not just a matter of feeding in the data and getting a reliable model to churn out yes/no decisions for a firm.

It makes me wonder about all the cheerleading and hype around “business solutions” offered by large firms such as Deloitte. Have those systems been tested for their long-term effects? Is there any guarantee of fairness toward the people whose lives will be affected by the AI system’s decisions?

And what is “fair,” anyway? Burt points out that statistical methods used to detect a disparate impact depend on human decisions about “what ‘fairness’ should mean in the context of each specific use case” — and also how to measure fairness.
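To make the measurement question concrete, here is a minimal sketch of one way a disparate impact could be quantified: compare approval rates between two groups. The 0.8 threshold is the "four-fifths" convention from US employment guidelines, used here only as an assumption for illustration; Burt's article does not prescribe it, and the loan decisions below are invented.

```python
def approval_rate(decisions):
    """Share of applicants approved; decisions is a list of booleans."""
    return sum(decisions) / len(decisions)


def impact_ratio(protected_group, reference_group):
    """Approval rate of the protected group divided by the reference group's."""
    return approval_rate(protected_group) / approval_rate(reference_group)


# Invented loan decisions (True = approved), 10 applicants per group.
protected = [True] * 3 + [False] * 7
reference = [True] * 6 + [False] * 4

ratio = impact_ratio(protected, reference)
THRESHOLD = 0.8  # assumed cutoff; choosing it is itself a human decision
print(f"impact ratio: {ratio:.2f}, below threshold: {ratio < THRESHOLD}")
```

Even in this tiny example, people had to choose which groups to compare, which outcome counts as favorable, and where to set the cutoff.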

The same applies to the law — not only in how it is written but also in how it is interpreted. Humans write the laws, and humans sit in judgment. However, legal standards are long established and can be used to place requirements on companies that produce, deploy, and use AI systems, Burt suggests.

  • Companies must “carefully monitor and document all their attempts to reduce algorithmic unfairness.”
  • They must also “generate clear, good faith justifications for using the models” that are at the heart of the AI systems they develop, use, or sell.

If these suggested standards were applied in a legal context, it could be shown whether a company had employed due diligence and acted responsibly. If the standards were written into law, companies that deploy unfair and discriminatory AI systems could be held liable and face penalties.



Comment moderation as a machine learning case study

Continuing my summary of the lessons in Introduction to Machine Learning from the Google News Initiative, today I’m looking at Lesson 5 of 8, “Training your Machine Learning model.” Previous lessons were covered here and here.

Now we get into the real “how it works” details — but still without looking at any code or computer languages.

The “lesson” (actually just a text) covers a common case for news organizations: comment moderation. If you permit people to comment on articles on your site, machine learning can be used to identify offensive comments and flag them so that human editors can review them.

With supervised learning (one of three approaches included in machine learning; see previous post here), you need labeled data. In this case, that means complete comments — real ones — that have already been labeled by humans as offensive or not. You need an equally large number of both kinds of comments. Creating this dataset of comments is discussed more fully in the lesson.

You will also need to choose a machine learning algorithm. Comments are text, obviously, so you’ll select among the existing algorithms that process language (rather than those that handle images and video). There are many from which to choose. As the lesson comes from Google, it suggests you use a Google algorithm.

In all AI courses and training modules I’ve looked at, this step is boiled down to “Here, we’ll use this one,” without providing a comparison of the options available. This is something I would expect an experienced ML practitioner to be able to explain — why are they using X algorithm instead of Y algorithm for this particular job? Certainly there are reasons why one text-analysis algorithm might be better for analyzing comments on news articles than another one.

What is the algorithm doing? It is creating and refining a model. The more accurate the final model is, the better it will be at predicting whether a comment is offensive. Note that the model doesn’t actually know anything. It is a computer’s representation of a “world” of comments in which some — with particular features or attributes perceived in the training data — are rated as offensive, and others — which lack a sufficient quantity of those features or attributes — are rated as not likely to be offensive.

The lesson goes on to discuss false positives and false negatives, which are possibly unavoidable — but the fewer, the better. We especially want to eliminate false negatives, which are offensive comments not flagged by the system.
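The lesson itself shows no code, but counting false positives and false negatives is simple enough to sketch. The labels below are made up, and the function names are my own rather than anything from the Google course.

```python
def confusion_counts(actual, predicted):
    """Count outcomes for a binary 'offensive' classifier.

    actual and predicted are equal-length lists of booleans,
    where True means 'offensive'.
    """
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, guess in zip(actual, predicted):
        if guess and truth:
            counts["tp"] += 1   # offensive, and flagged
        elif guess and not truth:
            counts["fp"] += 1   # harmless, but flagged (false positive)
        elif truth:
            counts["fn"] += 1   # offensive, but missed (false negative)
        else:
            counts["tn"] += 1   # harmless, and left alone
    return counts


def recall(counts):
    """Share of truly offensive comments that the system flagged."""
    offensive_total = counts["tp"] + counts["fn"]
    return counts["tp"] / offensive_total if offensive_total else 0.0


# Made-up labels for six comments (True = offensive).
human_labels = [True, True, True, False, False, False]
model_flags = [True, True, False, True, False, False]

counts = confusion_counts(human_labels, model_flags)
print(counts, recall(counts))
```

Pushing false negatives down means pushing recall up, usually at the cost of more false positives for the human editors to review.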

“The most common reason for bias creeping in is when your training data isn’t truly representative of the population that your model is making predictions on.”

—Lesson 6, Bias in Machine Learning

Lesson 6 in the course covers bias in machine learning. A quick way to understand how ML systems come to be biased is to consider the comment-moderation example above. What if the labeled data (real comments) included a lot of comments offensive to women — but all of the labels were created by a team of men, with no women on the team? Surely the men would miss some offensive comments that women team members would have caught. The training data are flawed because a significant number of comments are labeled incorrectly.

There’s a pretty good video attached to this lesson. It’s only 2.5 minutes, and it illustrates interaction bias, latent bias, and selection bias.

Lesson 6 also includes a list of questions you should ask to help you recognize potential bias in your dataset.

It was interesting to me that the lesson omits a discussion of how the accuracy of labels is really just as important as having representative data for training and testing in supervised learning. This issue is covered in ImageNet and labels for data, an earlier post here.



Examples of machine learning in journalism

Following on from yesterday’s post, today I looked at more lessons in Introduction to Machine Learning from the Google News Initiative. (Friday AI Fun posts will return next week.)

The separation of machine learning into three different approaches — supervised learning, unsupervised learning, and reinforcement learning — is standard (Lesson 3). In keeping with the course’s focus on journalism applications of ML, the example given for supervised learning is The Atlanta Journal-Constitution‘s deservedly famous investigative story about sex abuse of patients by doctors. Supervised learning was used to sort more than 100,000 disciplinary reports on doctors.

The example of unsupervised learning is one I hadn’t seen before. It’s an investigation of short-term rentals (such as Airbnb rentals) in Austin, Texas. The investigator used locality-sensitive hashing (LSH) to group property records in a set of about 1 million documents, looking for instances of tax evasion.
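I don't know which LSH variant the Austin investigator used. As a rough illustration of the general idea, here is a small MinHash-based sketch: documents that share many word sequences are likely to land in the same bucket, so only those need to be compared closely. The records, function names, and parameters are all invented for this example.

```python
import hashlib
from collections import defaultdict


def shingles(text, size=3):
    """Return the set of overlapping word sequences (assumes >= size words)."""
    words = text.lower().split()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def minhash_signature(items, num_hashes):
    """For each seeded hash function, keep the smallest hash value seen."""
    return [
        min(int(hashlib.md5(f"{seed}:{item}".encode()).hexdigest(), 16)
            for item in items)
        for seed in range(num_hashes)
    ]


def lsh_groups(documents, bands=10, rows=2):
    """Group document ids whose signatures agree on at least one band."""
    buckets = defaultdict(set)
    for doc_id, text in documents.items():
        signature = minhash_signature(shingles(text), bands * rows)
        for band in range(bands):
            key = (band, tuple(signature[band * rows:(band + 1) * rows]))
            buckets[key].add(doc_id)
    return {frozenset(ids) for ids in buckets.values() if len(ids) > 1}


# Invented property records: "a" and "b" overlap heavily, "c" does not.
records = {
    "a": "123 Main Street Austin Texas short term rental permit",
    "b": "123 Main Street Austin Texas short term rental listing",
    "c": "annual report on water quality in the city reservoir",
}
print(lsh_groups(records))
```

Records "a" and "b" are very likely, though not guaranteed, to be grouped together. That trade of certainty for speed is what makes the approach workable on a million documents.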

The main example given for reinforcement learning is AlphaGo (previously covered in this blog), but an example from The New York Times (How The New York Times Is Experimenting with Recommendation Algorithms) is also offered. Reinforcement learning is typically applied when a clear “reward” can be identified, which is why it’s useful in training an AI system to play a game (winning the game is a clear reward). It can also be used to train a physical robot to perform specified actions, such as pouring a liquid into a container without spilling any.

Also in Lesson 3, we find a very brief description of deep learning (it doesn’t mention layers and weights), and just a mention of neural networks.

“What you should retain from this lesson is fairly simple: Different problems require different solutions and different ML approaches to be tackled successfully.”

—Lesson 3, Different approaches to Machine Learning

Lesson 4, “How you can use Machine Learning,” might be the most useful in this set of eight lessons. Its content comes (with permission) from work done by Quartz AI Studio — specifically from the post How you’re feeling when machine learning might help, by the super-talented Jeremy B. Merrill.

The examples in this lesson are really good, so maybe you should just read it directly. You’ll learn about a variety of unusual stories that could only be told when journalists used machine learning to augment their reporting.

“Machine learning is not magic. You might even say that it can’t do anything you couldn’t do — if you just had a thousand tireless interns working for you.”

—Lesson 4, How you can use Machine Learning

(The Quartz AI Studio was created with a $250,000 grant from the Knight Foundation in 2018. For a year the group experimented, helped several news organizations produce great work, and ran a number of trainings for journalists. Then it was quietly disbanded in early 2020.)

Note (added April 4, 2022): The two links above to Quartz AI Studio content have been updated. The original domain, qz-dot-ai, was given up when, at renewal time, the price of all dot-ai domains had skyrocketed. Unfortunately, all the images have been lost, according to a personal communication from Merrill.




Google’s machine learning ‘course’ for journalists

I couldn’t resist dipping into this free course from the Google News Initiative, and what I found surprised me: eight short lessons that are available as PDFs.

The good news: The lessons are journalism-focused, and they provide a painless introduction to the subject. The bad news: This is not really a course or a class at all — although there is one quiz at the end. And you can get a certificate, for what it’s worth.

There’s a lot here that many journalists might not be aware of, and that’s a plus. You get a brief, clear description of Reuters’ News Tracer and Lynx Insight tools, both used in-house to help journalists discover new stories using social media or other data (Lesson 1). A report I recall hearing about — how automated real-estate stories brought significant new subscription revenue to a Swedish news publisher — is included in a quick summary of “robot reporting” (also Lesson 1).

Lesson 2 helpfully explains what machine learning is without getting into technical operations of the systems that do the “learning.” They don’t get into what training a model entails, but they make clear that once the model exists, it is used to make predictions. The predictions are not like what some tarot-card reader tells you but rather probability-based results that the model is able to produce, based on its prior training.

Noting that machine learning is a subset of the wider field called artificial intelligence is, of course, accurate. What is inaccurate is the definition “specific applications that use data to train a model to perform a given task independently and learn from experience.” They left out Q-learning, a type of reinforcement learning (a subset of machine learning), which does not use a model. It’s okay that they left it out, but they shouldn’t imply that all machine learning requires a trained model.
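For readers who want to see what Q-learning looks like, here is a minimal sketch on a toy problem I invented: an agent in a five-cell corridor earns a reward of 1 for reaching the last cell. Q-learning is usually called model-free because it never builds a model of how the environment behaves; it only keeps a table of estimated values for each state and action. The learning rate, discount factor, and episode count are arbitrary choices.

```python
import random

N_STATES = 5       # corridor cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)   # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment: move one cell; reward 1.0 on reaching the goal."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Fill in a table of action values from experience alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap on episode length
            action = rng.choice(ACTIONS)  # explore at random
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate toward
            # reward + discounted best value of the next state.
            best_next = 0.0 if done else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

The agent is never told the rules of the corridor. The table of values is all it learns, and reading off the best action in each cell gives its policy.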

The explosion of machine learning and AI in the past 10 years is explained nicely and concisely in Lesson 2. The lesson also touches on misconceptions and confusion surrounding AI:

“The lack of an officially agreed definition, the legacy of science-fiction, and a general low level of literacy on AI-related topics are all contributing factors.”

—Lesson 2, Is Machine Learning the same thing as AI?

I’ll be looking at Lessons 3 and 4 tomorrow.



Free courses in machine learning

Two days ago, I came upon this newly published course from FastAI: Practical Deep Learning for Coders. I actually stumbled across it via a video on YouTube, which I’ve watched now, and it made me feel optimistic about the course. I’m in the middle of the CS50 AI course from Harvard, though, so I need to hold off on the FastAI course for now.

Above: Screenshot from FastAI course

The first video got me thinking.

First, they said (as many others have said) that Python is the main programming language used for machine learning today. (This makes me happy, as I know Python.) But I wonder whether there’s more to this claim than I’m aware of.

Second, they said PyTorch has superseded TensorFlow as the framework of choice for machine learning. They said PyTorch is “much easier to use and much more useful for researchers.”

“Within the last 12 months, the percentage of papers at major conferences that use PyTorch has gone from 20 percent to 80 percent and vice versa — those that use TensorFlow have gone from 80 percent to 20 percent.”

—Jeremy Howard, in the FastAI video “Lesson 1 – Deep Learning for Coders (2020)”

Note, I don’t know if this is true. But it caught my attention.

FastAI is a library “that sits on top of PyTorch,” they explain. They say it is “the most popular higher-level API for PyTorch,” and it removes a lot of the struggle necessary to get started with PyTorch.

This leads me back to the CS50 AI course. The CS50 phenomenon was documented in The New Yorker in July 2020. One insanely popular course, Introduction to Computer Science, has spawned multiple follow-on courses, including the seven-module course about the principles of artificial intelligence in which I am currently enrolled (not for credit).

In the various online forums and Facebook groups devoted to CS50, you can see a lot of people asking whether they need to take the intro course prior to starting the AI course. Some of them admit they have never programmed before. They know nothing about coding. But they think they might take an AI programming course as their very first computer science course.

This is what I was thinking about as the speakers in the FastAI video both praised how easy FastAI makes it to train a model and cautioned that machine learning is not a task for code newbies.

Training and testing a machine learning system — a system that will make predictions to be used in some industry, some social context, where human lives might be affected — should not be dependent on someone who learned how to do it in one online course.

Some other free online courses:




How weird is AI?

For Friday AI Fun, I’m sharing one of the first videos I ever watched about artificial intelligence. It’s a 10-minute TED Talk by Janelle Shane, and it’s actually pretty funny. I do mean “funny ha-ha.”

I’m not wild about the ice-cream-flavors example she starts out with, because what does an AI know about ice cream, anyway? It’s got no tongue. It’s got no taste buds.

But starting at 2:07, she shows illustrations and animations of what an AI does in a simulation when it is instructed to go from point A to point B. For a robot with legs, you can imagine it walking, yes? Well, watch the video to see what really happens.

This brings up something I’ve only recently begun to appreciate: The results of an AI doing something may be entirely satisfactory — but the manner in which it produces those results is very unlike the way a human would do it. With both machine vision and game playing, I’ve seen how utterly un-human the hidden processes are. This doesn’t scare me, but it does make me wonder about how our human future will change as we rely more on these un-human ways of solving problems.

“When you’re working with AI, it’s less like working with another human and a lot more like working with some kind of weird force of nature.”

—Janelle Shane

At 6:23 in the video, Shane shows another example that I really love. It shows the attributes (in a photo) that an image recognition system decided to use when identifying a particular species of fish. You or I would look at the tail, the fins, the head — yes? Check out what the AI looks for.

Shane has a new book, You Look Like a Thing and I Love You: How Artificial Intelligence Works and Why It’s Making the World a Weirder Place. I haven’t read it yet. Have you?



Uses of AI in journalism

Part of my interest in AI centers on the way it is presented in online, print and broadcast media. Another focal point for me is how journalism organizations are using AI to do journalism work.

At the London School of Economics, a project named JournalismAI mirrors my interests. In November 2019 they published a report on a survey of 71 news organizations in 32 countries. They describe the report as “an introduction to and discussion of journalism and AI.”

Above: From the JournalismAI report

Many people in journalism are aware of the use of automation in producing stories on financial reports, sports, and real estate. Other applications of AI (mostly machine learning) are less well known — and they are numerous.

Above: From page 32 in JournalismAI report

Another resource available from JournalismAI is a collection of case studies — in the form of a Google sheet with links to write-ups about specific projects at news organizations. This list is being updated as new cases arise.

Above: From the JournalismAI case studies

It’s fascinating to open the links in the case studies and discover the innovative projects under way at so many news organizations. Journalism educators (like me) need to keep an eye on these developments to help us prepare journalism students for the future of our field.



Interrogating the size of AI algorithms

I have watched so many videos in my journey to understand how artificial intelligence and machine learning work, and one of my favorite YouTube channels belongs to Jordan Harrod. She’s a Ph.D. student working on neuroengineering, brain-machine interfaces, and machine learning.

I began learning about convolutional neural networks in my reading about AI. Like most people (?), I had a vague idea of a neural network being modeled after a human brain, with parallel processors wired together like human synapses. When you read about neural nets in AI, though, you are not reading about processors, computer chips, or hardware. Instead, you read about layers and weights. (Among other things.)

A deep neural network has multiple layers. That’s what makes it “deep.” You’ll see these layers in a simple diagram in the 4-minute video below. A convolutional neural network has hidden layers. These are not hidden as in “secret”; they are called hidden because they are sandwiched in between the input layer and the output layer.

The weights are — as with all computer data — numeric. What happens in machine learning is that the weights associated with each node in a layer are adjusted, again and again, during the process of training the AI — with an end result that the neural network’s output is more accurate, or even highly accurate.
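The "adjusted, again and again" part can be shown with the smallest network imaginable: one neuron with one weight. This is a sketch of my own, not something from Harrod's video. The neuron should learn that the output is twice the input, and the learning rate and number of passes are arbitrary choices.

```python
def train_neuron(samples, epochs=20, learning_rate=0.1):
    """Nudge a single weight until weight * x matches the targets."""
    weight = 0.0
    history = []  # total squared error for each pass over the data
    for _ in range(epochs):
        total_error = 0.0
        for x, target in samples:
            error = target - weight * x
            # Adjust the weight a little in the direction that shrinks the error.
            weight += learning_rate * error * x
            total_error += error ** 2
        history.append(total_error)
    return weight, history


# Training data for the rule "output = 2 * input".
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
weight, history = train_neuron(samples)
print(round(weight, 4), history[0], history[-1])
```

A real neural network does the same thing with thousands or millions of weights at once, which is one reason the question of size has no simple answer.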

As Harrod points out, not all AI systems include a neural network. She says that “training a model will almost always produce a set of values that correspond or are analogous to weights in a neural network.” I need to think more about that.

Now, does Harrod definitively answer the question “How big is an AI algorithm?” Not really. But she provides a nice set of concepts to help us understand why there isn’t just one simple answer to this question. She offers a glimpse at the way AI works under the hood that might make you hungry to learn more.



Face detection without a deep neural network

I was surprised when I watched this video about how most face detection works. Granted, this is not face recognition (identifying the specific person). Face detection looks at an image or video and can almost instantly point out all the human faces. In a consumer camera, this is part of the code that puts a rectangle around each person’s face while you’re framing your shot.

What’s wonderful in the video is how the Viola–Jones object detection framework is illustrated and explained so that even we non-math types can understand it.

Like the game cases I wrote about yesterday, this is a case where tried-and-true algorithms are used, but deep neural networks are not.

As is typical with AI, there is a model. How does the code identify a human face? It “knows” some things about the shape and proportions of human faces. But it knows these attributes (features) not as noses and eyes and mouths — as we humans do. Instead, it knows them as rectangular shapes that map very well to the pixels in a digital image.

Above: Graphic from Viola and Jones (2001) — PDF

Make sure you stay with the video until 3:30, when Mike Pound begins to draw on paper. (This drawing-by-hand is a large part of why I love the videos from Computerphile!) At 8:30 he begins drawing a face to show how the algorithm analyzes that segment of an image.

The one part that might not be clear (depending on how much time you spend thinking about pixels in images) is that the numbers in the grid he draws represent values of lightness or darkness in the image. In all cases, computers require knowledge to be represented as numbers. When dealing with images, numbers represent differences. To compare sections of an image with other sections, the numeric values for one section are added up and compared with the sum of numeric values from another section.
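Here is a minimal sketch of that add-up-and-compare step. It uses the "integral image" trick described in the Viola and Jones paper, which lets the sum of any rectangle be read off with four lookups. The tiny image and the function names are invented for illustration.

```python
def integral_image(pixels):
    """Build a summed-area table with an extra row and column of zeros."""
    height, width = len(pixels), len(pixels[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            table[y + 1][x + 1] = (
                pixels[y][x] + table[y][x + 1] + table[y + 1][x] - table[y][x]
            )
    return table


def region_sum(table, top, left, height, width):
    """Sum of the pixel values inside one rectangle, using four lookups."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


def two_rectangle_feature(table, top, left, height, width):
    """Lower half minus upper half: large when a dark band sits above a light one."""
    half = height // 2
    upper = region_sum(table, top, left, half, width)
    lower = region_sum(table, top + half, left, half, width)
    return lower - upper


# Invented 4x4 image: two dark rows (like eyes) above two light rows (like cheeks).
image = [[10] * 4, [10] * 4, [200] * 4, [200] * 4]
table = integral_image(image)
print(two_rectangle_feature(table, 0, 0, 4, 4))
```

A large value suggests the dark-over-light pattern is present. The detector checks many such rectangle patterns and discards a region as soon as enough of them fail.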

The animations in the final three minutes of the video provide an awesomely clear explanation of how the regions of the image are assessed and quickly discarded as “not a face” or retained for further examination.

Computers are lightning-fast at these kinds of calculations. This method is so efficient, it runs rapidly even on simple hardware — which is why this method of face detection has been in use since 2002.

