Increase Developer Productivity With Generative AI

Generative AI is revolutionizing how software developers write code. In this article, three Toptal developers share how they’re using Gen AI in their daily work and offer actionable advice for others who want to utilize this nascent technology.

Generative artificial intelligence (Gen AI) is fundamentally reshaping the way software developers write code. Released upon the world just a few years ago, this nascent technology has already become ubiquitous: In the 2023 State of DevOps Report, more than 60% of respondents indicated that they were routinely using AI to analyze data, generate and optimize code, and teach themselves new skills and technologies. Developers are continuously discovering new use cases and refining their approaches to working with these tools while the tools themselves are evolving at an accelerating rate.

Consider tools like Cognition Labs’ Devin AI: In spring 2024, the tool’s creators said it could resolve open GitHub issues without human help at least 13.86% of the time. That may not sound impressive until you consider that the previous industry benchmark for this task, set in late 2023, was just 1.96%.

How are software developers adapting to this new paradigm of software that can write software? What will a software engineer’s duties look like if the technology surpasses practitioners’ own code-writing abilities? Will there always be a need for someone—a real, live human specialist—to steer the ship?

We spoke with three Toptal developers with experience spanning back-end, mobile, web, and machine learning development to find out how they’re using generative AI to hone their skills and boost their productivity in their daily work. They shared what Gen AI does best and where it falls short; how others can make the most of generative AI for software development; and what the future of the software industry may look like if current trends prevail.

How Developers Are Using Generative AI

When it comes to AI for software development specifically, the most popular tools include OpenAI’s ChatGPT and GitHub Copilot. ChatGPT provides users with a simple text-based interface for prompting the large language model (LLM) about any topic under the sun, and is trained on the world’s publicly available internet data. Copilot, which sits directly inside a developer’s integrated development environment, provides advanced autocomplete functionality by suggesting the next line of code to write, and is trained on all of the publicly accessible code that lives on GitHub. Taken together, these two tools theoretically contain the solutions to pretty much any technical problem a developer might face.

The challenge, then, lies in knowing how to harness these tools most effectively. Developers need to understand what kinds of tasks are best suited for AI as well as how to properly tailor their input in order to get the desired output.

AI as an Expert and Intern Coder

“I use Cursor AI every day, and it does predict the exact line of code I was about to write more often than not,” says Quyet Say. “Generative AI is both an expert coworker to brainstorm with who can match your level of expertise, and a junior developer you can delegate simple atomic coding or writing tasks to.”

He explains that the tasks Gen AI is most useful for are those that take a long time to complete manually, but can be quickly checked for completeness and accuracy (think: converting data from one file format to another). GPT is also helpful for generating text summaries of code, but you still need an expert on hand who can understand the technical jargon.

An important step when using AI for these tasks is making sure any important code is bug-free before executing it.
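Say’s file-conversion example illustrates this “slow to do by hand, fast to check” pattern well. Here is a minimal Python sketch (the function names are illustrative, not taken from any of the developers’ toolkits) showing how an AI-drafted CSV-to-JSON conversion can be mechanically verified before its output is trusted:

```python
import csv
import io
import json

def csv_to_json(csv_text: str) -> str:
    """Convert CSV text (with a header row) into a JSON array of objects.
    Imagine this function was drafted by an AI assistant."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)

def verify_conversion(csv_text: str, json_text: str) -> bool:
    """Quickly check the AI's work: same row count, same field values."""
    expected = list(csv.DictReader(io.StringIO(csv_text)))
    actual = json.loads(json_text)
    return actual == expected

csv_text = "name,role\nAda,engineer\nGrace,admiral\n"
json_text = csv_to_json(csv_text)
assert verify_conversion(csv_text, json_text)  # only trust output that passes
```

Writing the verification yourself takes a fraction of the time the conversion would, and it gives you a concrete gate the AI’s output must pass before it touches anything important.
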

AI as a Personal Tutor and a Researcher

DucLM frequently uses Gen AI to learn new programming languages and tools: “I learned Terraform in one hour using GPT-4. I would ask it to draft a script and explain it to me; then I would request changes to the code, asking for various features to see if they were possible to implement.” He says that he finds this approach to learning to be much faster and more efficient than trying to acquire the same information through Google searches and tutorials.

That said, some models are preferable when factual accuracy is of the utmost importance. Lysenko strongly encourages developers to opt for GPT-4 or GPT-4 Turbo over earlier ChatGPT models like 3.5: “I can’t stress enough how different they are. It’s night and day: 3.5 just isn’t capable of the same level of complex reasoning.” According to OpenAI’s internal evaluations, GPT-4 is 40% more likely to provide factual responses than its predecessor. Crucially for those who use it as a personal tutor, GPT-4 is also far better at citing its sources, so its answers can be cross-referenced.

Optimizing Generative AI Tool Use

Gen AI can greatly boost developer productivity for coding, learning, and research tasks—but only if used correctly. Without enough context, ChatGPT is more likely to hallucinate plausible-looking but incorrect responses. In fact, research indicates that GPT-3.5’s responses to programming questions contain incorrect information a staggering 52% of the time. And incorrect context can be worse than none at all: If presented with a poor solution to a coding problem as a good example, ChatGPT will “trust” that input and build subsequent responses on that faulty foundation.

Prompt Engineering Strategies That Deliver Ideal Responses

The ways in which you prompt Gen AI tools can have a huge impact on the quality of the responses you receive. In fact, prompting holds so much influence that it has given rise to a subdiscipline dubbed prompt engineering, which describes the process of writing and refining prompts to generate high-quality outputs. In addition to being helped by context, AI also tends to generate more useful responses when given a clear scope and a description of the desired response, for example: “Give me a numbered list in order of importance.”

Prompt engineering specialists apply a wide range of approaches to coax the most ideal responses out of LLMs, including:

  • Zero-shot, one-shot, and few-shot learning: Supply zero, one, or a few worked examples in the prompt—just enough context for the model to infer the pattern, relying on its prior knowledge and reasoning capabilities for the rest.
  • Chain-of-thought prompting: Tell the AI to explain its thought process in steps to help understand how it arrives at its answer.
  • Iterative prompting: Guide the AI to the desired outcome by refining its output with iterative prompts, such as asking it to rephrase or elaborate on prior output.
  • Negative prompting: Tell the AI what not to do, such as what kind of content to avoid.

Lysenko stresses the importance of reminding chatbots to be brief in your prompts: “90% of the responses from GPT are fluff, and you can cut it all out by being direct about your need for short responses.” He also recommends asking the AI to summarize the task you’ve given it to ensure that it fully understands your prompt.

Oliveira advises developers to use the LLMs themselves to improve their prompts: “Select a sample where it didn’t perform as you wished and ask why it provided this response.” This can help you formulate a better prompt next time; in fact, you can even ask the LLM how it would change your prompt to get the response you were expecting.

According to Stébé, strong “people” skills are still relevant when working with AI: “Remember that AI learns by reading human text, so the rules of human communication apply: Be polite, clear, friendly, and professional. Communicate like a manager.”

For his tool Gladdis, Stébé creates custom personas for different purposes in the form of Markdown files that serve as baseline prompts. For example, his code reviewer persona is prompted with the following text that tells the AI who it is and what’s expected from it:

Directives

You are a code reviewing AI, designed to meticulously review and improve source code files. Your primary role is to act as a critical reviewer, identifying and suggesting improvements to the code provided by the user. Your expertise lies in enhancing the quality of a code file without changing its core functionality.

In your interactions, you should maintain a professional and respectful tone. Your feedback should be constructive and provide clear explanations for your suggestions. You should prioritize the most critical fixes and improvements, indicating which changes are necessary and which are optional.

Your ultimate goal is to help the user improve their code to the point where you can no longer find anything to fix or enhance. At this point, you should indicate that you cannot find anything to improve, signaling that the code is ready for use or deployment.

Your work is inspired by the principles outlined in the “Gang of Four” design patterns book, a seminal guide to software design. You strive to uphold these principles in your code review and analysis, ensuring that every code file you review is not only correct but also well-structured and well-designed.

Guidelines

– Prioritize your corrections and improvements, listing the most critical ones at the top and the less important ones at the bottom.

– Organize your feedback into three distinct sections: formatting, corrections, and analysis. Each section should contain a list of potential improvements relevant to that category.

Instructions

1. Begin by reviewing the formatting of the code. Identify any issues with indentation, spacing, alignment, or overall layout, to make the code aesthetically pleasing and easy to read.

2. Next, focus on the correctness of the code. Check for any coding errors or typos, and ensure that the code is syntactically correct and functional.

3. Finally, conduct a higher-level analysis of the code. Look for ways to improve error handling, manage corner cases, and make the code more robust, efficient, and maintainable.
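The mechanism behind persona files like this one is simple: the Markdown text becomes the baseline system prompt for every conversation with that persona. A minimal sketch of the idea (the file name and helper functions are illustrative; Gladdis’s actual implementation may differ) looks like this:

```python
from pathlib import Path

def load_persona(path: str) -> str:
    """Read a Markdown persona file to use as the baseline system prompt."""
    return Path(path).read_text(encoding="utf-8")

def build_messages(persona: str, user_code: str) -> list[dict]:
    """Assemble a chat request: the persona is the system message,
    and the code to review is the user's first turn."""
    return [
        {"role": "system", "content": persona},
        {"role": "user", "content": f"Please review this code:\n\n{user_code}"},
    ]

# In practice the persona would come from load_persona("code-reviewer.md");
# a short stand-in keeps this sketch self-contained.
persona = "# Directives\n\nYou are a code reviewing AI..."
messages = build_messages(persona, "def add(a, b): return a+b")
```

Because the persona lives in a plain Markdown file, it can be versioned, diffed, and refined over time just like any other piece of source code.
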

Prompt engineering is as much an art as it is a science, requiring a healthy amount of experimentation and trial-and-error to get to the desired output. The nature of natural language processing (NLP) technology means that there is no “one-size-fits-all” solution for obtaining what you need from LLMs—just like conversing with a person, your choice of words and the trade-offs you make between clarity, complexity, and brevity in your speech all have an impact on how well your needs are understood.

What’s the Future of Generative AI in Software Development?

Along with the rise of Gen AI tools, we’ve begun to see claims that programming skills as we know them will soon be obsolete: AI will be able to build your entire app from scratch, and it won’t matter whether you have the coding chops to pull it off yourself. Lysenko is not so sure about this—at least not in the near term. “Generative AI cannot write an app for you,” Lysenko says. “It struggles with anything that’s primarily visual in nature, like designing a user interface. For example, no generative AI tool I’ve found has been able to design a screen that aligns with an app’s existing brand guidelines.”

That’s not for a lack of effort: V0 from cloud platform Vercel has recently emerged as one of the most sophisticated tools in the realm of AI-generated UIs, but it’s still limited in scope to React code using shadcn/ui components. The end result may be helpful for early prototyping, but it still requires a skilled UI developer to implement custom brand guidelines. It seems that the technology needs to mature quite a bit more before it can genuinely compete with human expertise.

Lysenko sees the development of straightforward applications becoming increasingly commoditized, however, and is concerned about how this may impact his work over the long term. “Clients, largely, are no longer looking for people who code,” he says. “They’re looking for people who understand their problems, and use code to solve them.” That’s a subtle but distinct shift for developers, who are seeing their roles become more product-oriented over time. They’re increasingly expected to contribute to business objectives beyond merely wiring up services and resolving bugs. Lysenko recognizes the challenge this presents for some, but he prefers to see generative AI as just another tool in his kit—one that can give him leverage over competitors who might not be keeping up with the latest trends.

Overall, the most common use cases—as well as the technology’s biggest shortcomings—both point to the enduring need for experts to vet everything that AI generates. If you don’t understand what the final result should look like, then you won’t have any frame of reference for determining whether the AI’s solution is acceptable or not. As such, Stébé doesn’t see AI replacing his role as a tech lead anytime soon, but he isn’t sure what this means for early-career developers: “It does have the potential to replace junior developers in some instances, which worries me—where will the next generation of senior engineers come from?”

Regardless, now that Pandora’s box of LLMs has been opened, it seems highly unlikely that we’ll ever shun artificial intelligence in software development in the future. Forward-thinking organizations would be wise to help their teams upskill with this new class of tools to improve developer productivity, as well as educate all stakeholders on the security risks associated with inviting AI into our daily workflow. Ultimately, the technology is only as powerful as those who wield it.
