Spring AI + Gemini Tutorial: Your First API Call in Java (2026)

Spring AI is the official Spring project for integrating Large Language Models into Java applications. This tutorial walks through calling Google's Gemini API from a Spring Boot app in under 30 lines of Java, using the free Gemini Developer API tier (no credit card required).

If you're wondering whether AI work in Java is viable in 2026, the short answer is yes. Spring AI reached GA in 2025 and is now production-ready for the integration layer of LLM applications.

Quick Answer

To call Gemini from a Spring Boot application using Spring AI:

Add the starter: spring-ai-starter-model-google-genai (with the Spring AI BOM in pom.xml)
Get a free API key at aistudio.google.com/apikey — Google account only, no card
Configure gemini-2.5-flash in application.yml with your API key via environment variable
Inject ChatClient.Builder into a controller, build a ChatClient from it, and call .prompt().user(message).call().content()

Working code is on GitHub: github.com/TheRavi/spring-ai-recipes.

What is Spring AI?

Spring AI is an application framework for AI engineering in Java, modeled on familiar Spring conventions (dependency injection, auto-configuration, properties-based setup). It provides a typed wrapper around major LLM providers — Google Gemini, Anthropic Claude, OpenAI, Ollama, Amazon Bedrock, Azure OpenAI, and others — so you can swap providers by changing configuration rather than code.

For Java teams already running Spring Boot in production, Spring AI lowers the integration cost of adding LLM capabilities dramatically compared to hand-rolled HTTP clients.

Why start with Gemini for a Spring AI tutorial?

The Gemini Developer API has a genuinely free tier: no credit card, no expiring trial credits, no minimum purchase. Sign in with a Google account, generate a key, start making calls.

For learning Spring AI, that matters. On the free tier, gemini-2.5-flash allows enough requests per day to build, test, and iterate on real integrations without spending a dollar. Exact quotas vary by model and change over time, so check the current rate limits before relying on a specific number.

Prerequisites

Java 25 (or Java 21 LTS minimum — adjust <java.version> in pom.xml)
Maven 3.9+ (the project ships with the Maven wrapper)
A free Gemini API key from aistudio.google.com/apikey

Step 1: Generate the Spring Boot project

Skipping ahead? Clone the recipe folder and jump to Step 5.

Go to start.spring.io and pick:

Project: Maven
Language: Java
Spring Boot: 3.5.x (latest GA)
Dependencies: Spring Web

Spring Initializr may not yet offer a Google GenAI option in the UI, so we'll add that dependency manually in the next step.

Step 2: Add the Spring AI Google GenAI starter

In your pom.xml, add the starter:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-google-genai</artifactId>
</dependency>


Then import the Spring AI BOM so all Spring AI artifacts use consistent versions:

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.1.6</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>


Versions move quickly in the Spring AI ecosystem. Check the latest Spring AI release before pinning; if 1.1.6 doesn't resolve, the most recent 1.1.x patch should work. The BOM version is the one that matters: the individual starters inherit their versions from it.

Step 3: Configure the Gemini API

In src/main/resources/application.yml:

spring:
  application:
    name: hello-gemini
  ai:
    google:
      genai:
        api-key: ${GEMINI_API_KEY}
        chat:
          options:
            model: gemini-2.5-flash
            max-output-tokens: 1024
            temperature: 0.7


A few configuration choices worth understanding:

Externalize the API key. The ${GEMINI_API_KEY} placeholder reads from your environment at runtime. Never commit a real key.
Don't set project-id or location. Those properties switch the client to Vertex AI (paid GCP) mode. For the free Gemini Developer API, set only api-key.
Pin the model explicitly. Framework defaults shift between Spring AI versions. Pinning ensures consistent behavior across deployments.
Use max-output-tokens, not max-tokens. The Gemini configuration key differs from Anthropic's. Spring Boot ignores the unrecognized key without a warning, so the limit you intended is never applied.
temperature: 0.7 is a balanced middle ground. Lower values (0.0–0.3) for factual tasks; higher (0.8–1.0) for creative output.

Keeping your API key out of source control

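A convenient local pattern is an application-local.yml file that is listed in .gitignore. Spring Boot loads it on top of application.yml only when the local profile is active, so the real key lives in a file that never reaches the repository: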

src/main/resources/application-local.yml (never committed):

spring:
  ai:
    google:
      genai:
        api-key: your-real-key-here


Run with the local profile:

./mvnw spring-boot:run -Dspring-boot.run.profiles=local


This keeps application.yml clean and committable with the ${GEMINI_API_KEY} placeholder intact. It also avoids the most common cause of "API key not valid" errors — environment variables that don't reach the JVM (see gotcha #1 below).

Step 4: Write the controller

Create a REST controller that accepts a user message and returns Gemini's response as typed JSON:

package com.example.hellogemini;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ChatController {

    private final ChatClient chatClient;

    // Spring Boot injects the builder that the Google GenAI starter auto-configures.
    public ChatController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    // GET /chat?message=... sends the message to Gemini and returns prompt plus reply as JSON.
    @GetMapping("/chat")
    public ChatResponse chat(
            @RequestParam(defaultValue = "Hello, Gemini. Briefly introduce yourself.") String message) {
        var reply = chatClient
                .prompt()
                .user(message)
                .call()
                .content();

        return new ChatResponse(message, reply);
    }

    // Response body for this endpoint. This is the controller's own record,
    // not Spring AI's ChatResponse class, which this file does not import.
    public record ChatResponse(String prompt, String reply) {}
}


Three things worth understanding about this code:

ChatClient.Builder is auto-configured. Adding the Google GenAI starter triggers Spring Boot to wire up a builder pre-configured for Gemini. You inject it, call .build(), and you have a working client.
The fluent API mirrors prompt structure. .prompt().user(message).call().content() reads top-to-bottom as "build a prompt, set the user message, make the call, return the content."
ChatResponse is a Java record. A one-liner that gives you an immutable data class — Spring Boot serializes it to JSON automatically.
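To make the record point concrete, here is a small standalone sketch in plain Java (no Spring) of what that one-line declaration gives you. The RecordDemo class name is only for illustration; the record itself has the same shape as the one nested in the controller.

```java
class RecordDemo {

    // Same shape as the record nested in ChatController.
    record ChatResponse(String prompt, String reply) {}

    public static void main(String[] args) {
        ChatResponse a = new ChatResponse("Hi", "Hello!");
        ChatResponse b = new ChatResponse("Hi", "Hello!");

        // Accessors are named after the components; there are no setters.
        System.out.println(a.prompt());   // prints "Hi"

        // equals() and hashCode() compare component values, not identity.
        System.out.println(a.equals(b));  // prints "true"

        // toString() lists every component.
        System.out.println(a);            // prints "ChatResponse[prompt=Hi, reply=Hello!]"
    }
}
```

The component names are also what end up as the JSON keys in the response, which is why the endpoint returns "prompt" and "reply".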

Step 5: Run and test

Start the application:

./mvnw spring-boot:run -Dspring-boot.run.profiles=local


In another terminal, call the endpoint:

curl "http://localhost:8080/chat?message=What%20is%20Spring%20AI%20in%20one%20sentence%3F"


You should see a JSON response with both your prompt and Gemini's reply:

{
  "prompt": "What is Spring AI in one sentence?",
  "reply": "Spring AI is an application framework..."
}
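
If you would rather call the endpoint from Java than from curl, the JDK's built-in HTTP client is enough. The following is a minimal sketch, assuming the app from this tutorial is running on localhost:8080; ChatEndpointClient and chatUri are illustrative names, not part of the recipe project.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class ChatEndpointClient {

    // Builds the /chat URL with the message percent-encoded as a query parameter.
    static URI chatUri(String baseUrl, String message) {
        // URLEncoder emits form encoding ("+" for spaces); switch to %20 to match the curl example.
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8).replace("+", "%20");
        return URI.create(baseUrl + "/chat?message=" + encoded);
    }

    public static void main(String[] args) throws Exception {
        URI uri = chatUri("http://localhost:8080", "What is Spring AI in one sentence?");

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();

        // Requires the Spring Boot app to be running locally; otherwise this throws a connection error.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

For the message used here, chatUri produces exactly the URL passed to curl above, so both clients hit the endpoint the same way.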


What just happened?

When the GET request hit /chat, Spring AI:

Wrapped your message as a user-role message in Gemini's expected format
Called Google's Generative Language API with the model, output limit, and temperature from your configuration
Received the response, extracted the text content, and returned it through the controller

Spring AI also manages the HTTP client lifecycle, handles rate limit responses, validates model parameters, and applies any configured retry logic. Your integration code stays focused on business logic, not infrastructure.
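
Retry logic is the least visible item in that list, so here is a plain-Java sketch of the general idea: retry with exponential backoff. It is not Spring AI's implementation, and RetrySketch and withBackoff are hypothetical names. A production policy would also retry only transient failures (a rate-limit or server error, say) rather than every exception as this sketch does.

```java
import java.time.Duration;
import java.util.function.Supplier;

class RetrySketch {

    // Runs the task up to maxAttempts times, doubling the wait after each failure.
    static <T> T withBackoff(Supplier<T> task, int maxAttempts, Duration initialDelay) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Duration delay = initialDelay;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleep(delay);
                    delay = delay.multipliedBy(2);
                }
            }
        }
        // Every attempt failed: surface the most recent error.
        throw last;
    }

    private static void sleep(Duration delay) {
        try {
            Thread.sleep(delay.toMillis());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated call that fails twice before succeeding.
        String reply = withBackoff(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("simulated rate limit");
            }
            return "ok";
        }, 5, Duration.ofMillis(10));
        System.out.println(reply + " after " + calls[0] + " attempts"); // prints "ok after 3 attempts"
    }
}
```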

Which Gemini models work on the free tier in 2026?

As of May 2026, the current free-tier models are:

gemini-2.5-flash — recommended default. Fast, capable, generous limits.
gemini-2.5-flash-lite — lighter and faster, lower quality
gemini-2.5-pro — most capable, slower, 5 requests/minute on free tier

Check the current Gemini model list before pinning a model name in production. Gemini model names are retired and replaced faster than identifiers in most ecosystems.

Real gotchas from building this

These aren't from the docs — they're from actually building this integration in May 2026. If your first attempt fails, one of these is almost certainly why.

1. The environment variable doesn't reach your IDE

The most common failure mode: you ran export GEMINI_API_KEY=... in a terminal, then hit "Run" in IntelliJ or VS Code. An IDE that wasn't launched from that terminal doesn't inherit the variable, and neither does the JVM it spawns. The ${GEMINI_API_KEY} placeholder never resolves to your real key, and Gemini returns a 400 "API key not valid" error.

Fix for IntelliJ: Run → Edit Configurations → find your run config → "Environment variables" field → add GEMINI_API_KEY=your-key.

The cleaner fix: use application-local.yml as described above. It works regardless of how you launch the app.
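
Whichever fix you pick, a fail-fast check makes this failure obvious before any request reaches Google. Below is a minimal plain-Java sketch; ApiKeyGuard and requireUsable are hypothetical names, and treating a value that still starts with "${" as an unresolved placeholder is an assumption about how the misconfiguration shows up, not documented Spring behavior.

```java
class ApiKeyGuard {

    // Returns the trimmed key if it looks usable; otherwise throws with a hint about the likely cause.
    static String requireUsable(String rawKey) {
        String key = rawKey == null ? "" : rawKey.trim();
        if (key.isEmpty()) {
            throw new IllegalStateException(
                    "GEMINI_API_KEY is missing or empty. If you launched from an IDE, "
                            + "set it in the run configuration or use the local profile.");
        }
        if (key.startsWith("${")) {
            // Assumption: an unresolved placeholder may be passed through as literal text.
            throw new IllegalStateException("API key is still an unresolved placeholder: " + key);
        }
        return key;
    }

    public static void main(String[] args) {
        // Reads the same variable that application.yml refers to.
        String key = requireUsable(System.getenv("GEMINI_API_KEY"));
        System.out.println("Key present, length " + key.length());
    }
}
```

Running main from the IDE and from the terminal side by side shows quickly whether the variable reaches both environments.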

2. Accidentally switching to Vertex AI mode

Spring AI's Google GenAI starter supports two authentication modes — the free Gemini Developer API (API key) and paid Vertex AI (GCP credentials). If you set project-id or location anywhere in your config, even experimentally, the client switches to Vertex AI mode and your Developer API key will be rejected with a 400 auth error.

For the free tier: set only api-key. Delete any project-id or location properties.

3. `gemini-2.0-flash` returns 429 with `limit: 0`

This is a confusing error — it looks like a quota problem but it isn't. A real quota error shows your daily limit (e.g. limit: 1500). limit: 0 means your account has zero free-tier access to that model, typically because the model is deprecated and Google has already cut free-tier capacity to it.

gemini-2.0-flash is deprecated as of February 2026 and shutting down June 1, 2026. Don't use it.
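
The distinction between limit: 0 and a real quota error is easy to encode if you log or surface 429s. The sketch below assumes only that the error text contains "limit: <number>" as in the examples above; the sample messages, QuotaErrorSketch, and its method names are made up for illustration and do not reproduce Google's exact error format.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotaErrorSketch {

    enum Kind { NO_FREE_TIER_ACCESS, QUOTA_EXHAUSTED, UNKNOWN }

    // Assumption: the 429 error text contains "limit: <number>".
    private static final Pattern LIMIT = Pattern.compile("limit:\\s*(\\d{1,18})");

    // Extracts the reported limit, if the text contains one.
    static OptionalLong reportedLimit(String errorText) {
        if (errorText == null) {
            return OptionalLong.empty();
        }
        Matcher m = LIMIT.matcher(errorText);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }

    // limit: 0 means no free-tier access to the model; any positive limit is a genuine quota hit.
    static Kind classify(String errorText) {
        OptionalLong limit = reportedLimit(errorText);
        if (limit.isEmpty()) {
            return Kind.UNKNOWN;
        }
        return limit.getAsLong() == 0 ? Kind.NO_FREE_TIER_ACCESS : Kind.QUOTA_EXHAUSTED;
    }

    public static void main(String[] args) {
        // Hypothetical error snippets, shaped like the examples in the text.
        System.out.println(classify("429 quota exceeded, limit: 0"));    // prints "NO_FREE_TIER_ACCESS"
        System.out.println(classify("429 quota exceeded, limit: 1500")); // prints "QUOTA_EXHAUSTED"
        System.out.println(classify("500 internal error"));              // prints "UNKNOWN"
    }
}
```

A NO_FREE_TIER_ACCESS result is the cue to change the model name rather than wait for the quota to reset.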

4. `gemini-1.5-flash` returns 404

All Gemini 1.x model identifiers are shut down and return 404. Any tutorial recommending gemini-1.5-flash or gemini-1.5-pro is out of date.

5. `max-tokens` vs `max-output-tokens`

The Anthropic integration uses max-tokens. The Gemini integration uses max-output-tokens. With the wrong key, the setting is silently ignored, so output is capped only by the model's own default. That can mean cost overruns if you upgrade to a paid tier.
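
Because nothing fails at startup, a small check in a test is one way to catch this. The sketch below works on a flattened set of property keys; how you collect those keys from your configuration is left out, and GeminiOptionsLint and misnamedLimitKeys are hypothetical names. The prefix matches the application.yml used in this tutorial.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

class GeminiOptionsLint {

    // Property prefix for Gemini chat options, as used in this tutorial's application.yml.
    static final String OPTIONS_PREFIX = "spring.ai.google.genai.chat.options.";

    // Returns keys under the Gemini options prefix that spell the limit as
    // max-tokens (in any common spelling) instead of max-output-tokens.
    static List<String> misnamedLimitKeys(Set<String> propertyKeys) {
        return propertyKeys.stream()
                .filter(key -> key.startsWith(OPTIONS_PREFIX))
                .filter(key -> normalize(key.substring(OPTIONS_PREFIX.length())).equals("maxtokens"))
                .sorted()
                .toList();
    }

    // Strips dashes and underscores and lowercases, so max-tokens, max_tokens and maxTokens all match.
    private static String normalize(String name) {
        return name.replace("-", "").replace("_", "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Set<String> keys = Set.of(
                "spring.ai.google.genai.chat.options.model",
                "spring.ai.google.genai.chat.options.max-tokens");
        // prints "[spring.ai.google.genai.chat.options.max-tokens]"
        System.out.println(misnamedLimitKeys(keys));
    }
}
```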

6. Spring AI BOM version mismatch

Spring AI starters must align with the BOM version. Mixing versions produces ClassNotFoundException, NoSuchMethodError, or cryptic auto-configuration failures at startup. Always import the BOM and let it resolve individual artifact versions — don't add explicit <version> tags to Spring AI dependencies.

Frequently asked questions

Is Spring AI production-ready?

Yes. Spring AI reached general availability in 2025 (1.0 GA), and the 1.1.x line has been stable through 2026. It's used in production by Spring shops integrating LLMs into existing Java services.

Can I use the same code with Claude or OpenAI?

Mostly. Spring AI's ChatClient abstraction is provider-portable — the controller code stays identical. You'd change the starter dependency (spring-ai-starter-model-anthropic for Claude, spring-ai-starter-model-openai for OpenAI) and the corresponding application.yml block. Model names and tuning parameters obviously differ.

Does the free tier require a credit card?

No. The Gemini Developer API free tier requires only a Google account. Vertex AI (the paid GCP option) requires a billing account, but you only enter that path if you explicitly set project-id/location in your config.

What's the difference between Spring AI 1.1 and 2.0?

Spring AI 1.1.x targets Spring Framework 6 / Spring Boot 3.x. Spring AI 2.0 (currently in milestone) targets Spring Framework 7 / Spring Boot 4.0. For a tutorial in 2026, 1.1.x is the right stable choice. Upgrade once 2.0 reaches GA.

What to build next

This is the smallest possible Spring AI integration. The framework's real value emerges in the features that build on top of ChatClient:

Structured output. Return typed Java POJOs from Gemini instead of raw strings — one of Spring AI's strongest features, and arguably where Java's type system beats Python ergonomically for AI work.
Tool calling. Let Gemini invoke your Java methods (database lookups, API calls, computations) as part of its reasoning.
Provider portability in practice. Switch from Gemini to Claude to a local Ollama model by changing configuration, not code.
Retrieval-augmented generation (RAG). Augment prompts with retrieved context from your own data.

Each deserves its own walkthrough. This is the first post in a Spring AI recipes series — next up is structured outputs, where Spring AI's Java-native ergonomics show their strongest advantage.

Java for AI engineering: a quick honest take

For prototyping, model training, and rapid experimentation, Python's ecosystem remains unmatched.

For the integration layer — taking a model and connecting it to enterprise systems that already exist — Java has quietly built a strong story. Spring AI is part of it. So is LangChain4j. So are the JVM-native agent frameworks emerging through 2025 and 2026.

If you're a Java backend engineer wondering whether AI work requires a stack change: probably not. Learn the AI fundamentals, then stay in the language you already ship production code in. The tooling has caught up.

Working code

The complete project for this tutorial is on GitHub:

github.com/TheRavi/spring-ai-recipes — 01-hello-gemini/

Clone it, add your API key to application-local.yml, and run.
