The AI Engineer

By Ross Hale on Jul 25, 2024

Over the past year we’ve witnessed what so often happens during platform shifts: a convergence of roles. 

Previously, in the cloud era, there was a convergence of Dev & Ops. 

Before that, there was a convergence of frontend and backend (including, but not limited to, database admins and web developers).

In all of these cases, what makes the change possible? It’s the platform: what was previously complicated, expensive, and slow becomes easy, cheap, and fast, thanks to the packaging-up of all that complexity into a beautiful black box.

In the case of Gen AI, pre-trained models (both commercial and open source) have created a “platform” layer that is approachable enough for an engineer (read: not a mathematician or scientist) to use effectively. 

In the previous world, if you wanted to do serious AI or ML, you had to hire a team of data scientists, acquire massive amounts of data, and spend an inordinate amount of time building and training models from scratch. It was, in effect, a “cold start”. The time to value was long and by no means guaranteed. Oftentimes those teams would fail to produce a useful result due to the technical complexity, lack of data, or simply because the problem turned out to be too difficult to solve.

The new breed of AI, which started with models based on the transformer architecture, solves this “cold start”. Foundation models come pre-packed with incredible capabilities right out of the box. By comparison with training from scratch, fine-tuning requires a fraction of the data and time. Time-to-value is therefore radically reduced, as is the time needed to validate the feasibility of modeling a problem.

And that’s where the AI Engineer comes in.

But what exactly is an AI Engineer and what does their role entail?

There are two sides of the AI coin as it relates to software development: 

  1. Engineers who are experts at building AI-powered applications and systems.

  2. Engineers who effectively utilize AI tooling in their development workflow.

When I (and the industry) talk about AI Engineers, we’re really talking about the first one: engineers trained in understanding and practicing the specific technologies necessary to effectively build the new breed of AI applications.

And, as an aside, these new skills aren’t limited to engineers -- they apply to product, design, and other functions as well!

Ultimately, we’re looking for the ability to form small teams who can build great software that works reliably and get it deployed to production.  That means a new type of engineer able to cut across some of the previously defined disciplines of full stack development, data engineering, and data science / machine learning.

Here’s a relevant image from around a year ago from the blog post that many credit with getting this conversation started:

A year later, I snapped this photo of an updated version at the “AI Engineer World’s Fair”, the latest AI Engineering conference up in San Francisco:

This one gets much closer to Artium’s internal view of the world, but I think there’s even one more modification that we at Artium would make:

This is the convergence: A combination of both market-facing full stack product development and AI-specific expertise.

Which is, to be fair, quite a bit!  Luckily, as developer tools and abstractions improve (and AI is itself playing a big role in that improvement), it becomes more and more possible for senior developers to cut across this much of the stack.

Note that ML, Data Science, and AI research specialities still play a major role here.  They have the expertise and ability to build models from scratch (still a very relevant part of the more advanced systems), perform training, quantize models, and much, much more.  They effectively know how to work inside our beautiful black box, whereas the AI engineer knows about the inside of the box without necessarily being able to recreate it.

AI Engineering is going to play a huge role in driving this next cycle of tech.  What gets built, how it gets built, and overall how successfully we drive innovation and improvement of the software landscape through this new technology is all, ultimately, up to the people doing the building.

That’s why we’re looking to combine all the goodness of full-stack, agile software developers with expertise in AI models, testing & evaluation, and data engineering, to create small, effective teams who bring excellent and reliable products to market.

Now let’s go build :)
