Images play a major role in modern web applications, including dynamic content portals, product pages, and blogs. However, writing good alt text and captions for every uploaded image is laborious and tedious. With Gemini AI, an uploaded image can be analyzed automatically to produce:

  • A meaningful, context-aware caption
  • SEO-friendly alt text
  • Multiple variations for each
  • Output in a language of your choice

This article shows you how to integrate Gemini AI into ASP.NET Core MVC (.NET 10) to process images and generate high-quality descriptive text.

Why Use Gemini AI for Image Captioning?
Gemini AI is trained on a massive multimodal dataset, which means:

  • It understands visual context (objects, scenes, emotions, actions)
  • It can generate human-like captions
  • It supports multiple languages
  • It can return multiple options so users can choose the best one

Features We Will Implement

  • Upload an image in ASP.NET Core MVC
  • Send the image bytes to Gemini AI
  • Ask Gemini to generate:
    • Multiple caption options
    • Multiple alt text variations
    • Output in multiple languages
  • Render results on the page
  • SEO-friendly and accessible output

Before diving into the code, create a project in Visual Studio (2026 preferred) and get a Gemini API key from Google AI Studio.

Project Structure
The image below shows the project structure used for this demonstration.

Add Gemini API Key to appsettings.json
{
  "Gemini": {
    "ApiKey": "your-api-key-here"
  }
}

For better security, keep the API key in User Secrets rather than in appsettings.json.

Create the Image Upload View
<div class="container mt-5">
    <div class="card shadow p-4">
        <h2 class="mb-4 text-center">AI Image Caption Generator</h2>
        <form asp-action="Analyze" enctype="multipart/form-data" method="post">
            <div class="mb-3">
                <label class="form-label">Select Image</label>
                <input type="file" name="file" class="form-control" required />
            </div>
            <div class="mb-3">
                <label class="form-label">Select Language</label>
                <select name="language" class="form-select">
                    <option value="en">English</option>
                    <option value="ne">Nepali</option>
                    <option value="hi">Hindi</option>
                    <option value="es">Spanish</option>
                    <option value="fr">French</option>
                    <option value="ja">Japanese</option>
                </select>
            </div>
            <button class="btn btn-primary w-100">Analyze Image</button>
        </form>
    </div>
</div>
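
The form posts short codes such as "ne" or "ja", while the service shown later in this article takes a language parameter that defaults to the name "English". One way to bridge the two is a small lookup that turns a code into a language name before it goes into the prompt. The sketch below is only an illustration: LanguageHelper and ToLanguageName are hypothetical names that do not appear elsewhere in this article, and falling back to English is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the original project): maps the language
// codes posted by the upload form to the names used in the prompt.
public static class LanguageHelper
{
    private static readonly Dictionary<string, string> Names =
        new(StringComparer.OrdinalIgnoreCase)
        {
            ["en"] = "English",
            ["ne"] = "Nepali",
            ["hi"] = "Hindi",
            ["es"] = "Spanish",
            ["fr"] = "French",
            ["ja"] = "Japanese",
        };

    // Returns the language name for a known code. Null, empty, or
    // unrecognized input falls back to English (an assumption of this sketch).
    public static string ToLanguageName(string? code)
    {
        if (!string.IsNullOrWhiteSpace(code) && Names.TryGetValue(code.Trim(), out var name))
            return name;
        return "English";
    }
}
```

In the controller, you could pass LanguageHelper.ToLanguageName(language) to the service instead of the raw form value. Because only known codes are accepted, arbitrary form input never reaches the prompt.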


Create the Service to Handle Gemini Connection
GeminiService handles the connection to the Gemini AI API and sends the prompt. The uploaded image is converted to a Base64 string because the request carries the image as inline Base64 data.
using System.Text;
using System.Text.Json;

public class GeminiService
{
    private readonly IConfiguration _config;
    private readonly IHttpClientFactory _httpClientFactory;

    public GeminiService(IConfiguration config, IHttpClientFactory httpClientFactory)
    {
        _config = config;
        _httpClientFactory = httpClientFactory;
    }

    public async Task<(List<string> captions, List<string> alts)>
        AnalyzeImageAsync(byte[] imageBytes, string mimeType, string language = "English")
    {
        string? apiKey = _config["Gemini:ApiKey"];
        if (string.IsNullOrWhiteSpace(apiKey))
            throw new InvalidOperationException("Gemini API key is missing. Set Gemini:ApiKey in configuration.");
        var http = _httpClientFactory.CreateClient();
        string base64Image = Convert.ToBase64String(imageBytes);
        var requestBody = new
        {
            contents = new[]
            {
                new
                {
                    parts = new object[]
                    {
                        // Part 1: the text prompt
                        new
                        {
                            text =
                                $"Analyze this image and return:" +
                                $"\n - 5 caption options" +
                                $"\n - 5 alt text options" +
                                $"\n - Language: {language}" +
                                $"\nRespond in JSON only: {{ \"captions\": [...], \"alts\": [...] }}"
                        },
                        // Part 2: the image as inline Base64 data
                        new
                        {
                            inline_data = new
                            {
                                mime_type = mimeType,
                                data = base64Image
                            }
                        }
                    }
                }
            }
        };
        string url =
            $"https://generativelanguage.googleapis.com/v1/models/gemini-2.5-flash:generateContent?key={apiKey}";
        var response = await http.PostAsync(
            url,
            new StringContent(JsonSerializer.Serialize(requestBody), Encoding.UTF8, "application/json")
        );

        if (!response.IsSuccessStatusCode)
        {
            string error = await response.Content.ReadAsStringAsync();
            throw new Exception($"Gemini API Error: {error}");
        }

        var json = await response.Content.ReadFromJsonAsync<JsonElement>();

        // The generated text is in the first part of the first candidate.
        var textResponse = json
            .GetProperty("candidates")[0]
            .GetProperty("content")
            .GetProperty("parts")[0]
            .GetProperty("text")
            .GetString() ?? string.Empty;

        // Strip the Markdown code fence that the model may wrap around the JSON.
        textResponse = textResponse.Replace("```json", "").Replace("```", "").Trim();

        using var resultDoc = JsonDocument.Parse(textResponse);
        var resultJson = resultDoc.RootElement;
        var captions = resultJson.GetProperty("captions")
            .EnumerateArray().Select(x => x.GetString() ?? string.Empty).ToList();

        var alts = resultJson.GetProperty("alts")
            .EnumerateArray().Select(x => x.GetString() ?? string.Empty).ToList();

        return (captions, alts);
    }
}
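
The parsing step above assumes the model replies with exactly the JSON that was requested. In practice the text can arrive wrapped in a code fence or with a sentence in front of it, and a key may be missing. The following is a rough sketch of a more tolerant parser. GeminiResponseParser is a hypothetical helper, not part of the Gemini API or of the original service, and treating a missing array as an empty list is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper: pulls the "captions" and "alts" arrays out of the
// text returned by the model.
public static class GeminiResponseParser
{
    public static (List<string> Captions, List<string> Alts) Parse(string? text)
    {
        if (string.IsNullOrWhiteSpace(text))
            throw new FormatException("The model returned an empty reply.");

        // Keep only the outermost {...} so code fences or stray prose are ignored.
        int start = text.IndexOf('{');
        int end = text.LastIndexOf('}');
        if (start < 0 || end <= start)
            throw new FormatException("No JSON object found in the model reply.");

        using var doc = JsonDocument.Parse(text.Substring(start, end - start + 1));
        return (ReadStrings(doc.RootElement, "captions"), ReadStrings(doc.RootElement, "alts"));
    }

    // Returns the non-empty strings of the named array, or an empty list
    // when the property is absent or is not an array.
    private static List<string> ReadStrings(JsonElement root, string name)
    {
        if (root.ValueKind != JsonValueKind.Object ||
            !root.TryGetProperty(name, out var array) ||
            array.ValueKind != JsonValueKind.Array)
            return new List<string>();

        return array.EnumerateArray()
            .Where(e => e.ValueKind == JsonValueKind.String)
            .Select(e => e.GetString()!)
            .Where(s => s.Length > 0)
            .ToList();
    }
}
```

Inside AnalyzeImageAsync, the fence stripping and the GetProperty calls could be replaced with a single call to GeminiResponseParser.Parse(textResponse). Malformed JSON still throws a JsonException, which the controller's catch block would report to the user.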


Register GeminiService

In Program.cs, add the following lines to register HttpClient and GeminiService.
// Register HttpClient and GeminiService
builder.Services.AddHttpClient();
builder.Services.AddSingleton<GeminiService>();


Create a Controller

When the user uploads an image, the ImageController processes the form submission, converts the file into bytes, detects its MIME type, and sends the prepared image data to the GeminiService for further processing by Gemini AI.
using ImageAnalyzer.Models;
using Microsoft.AspNetCore.Mvc;

public class ImageController : Controller
{
    private readonly GeminiService _gemini;

    public ImageController(GeminiService gemini)
    {
        _gemini = gemini;
    }

    public IActionResult Index() => View();

    [HttpPost]
    [RequestSizeLimit(10 * 1024 * 1024)] // 10 MB
    public async Task<IActionResult> Analyze(IFormFile file, string language = "en")
    {
        if (file == null || file.Length == 0)
        {
            ModelState.AddModelError("file", "Please select an image file.");
            // Redisplay the upload form, which is the Index view.
            return View("Index");
        }

        using var ms = new MemoryStream();
        await file.CopyToAsync(ms);
        var bytes = ms.ToArray();

        var result = new ResponseModel();
        try
        {
            var mimeType = MimeTypeHelper.GetMimeType(file.FileName);
            var content = await _gemini.AnalyzeImageAsync(bytes, mimeType, language);
            result.Alts = content.alts;
            result.Captions = content.captions;
        }
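        catch (Exception ex)
        {
            // Send the user back to the upload page. The Index view must
            // render TempData["Error"] for this message to appear.
            TempData["Error"] = "Failed to analyze the uploaded image. Error: " + ex.Message;
            return RedirectToAction(nameof(Index));
        }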

        // Pass model to view
        return View("Result", result);
    }
}
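
The controller calls MimeTypeHelper.GetMimeType, but the helper itself is not listed in this article. A minimal sketch of what it might look like follows. The extension table is an assumption based on the image formats commonly listed for Gemini image input (JPEG, PNG, WebP, HEIC, HEIF), and throwing on an unknown extension is a design choice made here so that the controller's catch block reports the problem.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical implementation of the MimeTypeHelper used by the controller.
// It decides the MIME type from the file extension only.
public static class MimeTypeHelper
{
    private static readonly Dictionary<string, string> Types =
        new(StringComparer.OrdinalIgnoreCase)
        {
            [".jpg"] = "image/jpeg",
            [".jpeg"] = "image/jpeg",
            [".png"] = "image/png",
            [".webp"] = "image/webp",
            [".heic"] = "image/heic",
            [".heif"] = "image/heif",
        };

    // Returns the MIME type for a supported image extension and throws
    // NotSupportedException for anything else.
    public static string GetMimeType(string fileName)
    {
        string extension = Path.GetExtension(fileName ?? string.Empty);
        if (Types.TryGetValue(extension, out var mimeType))
            return mimeType;
        throw new NotSupportedException($"Unsupported image type: '{extension}'.");
    }
}
```

A file extension can be wrong or spoofed, so a stricter version might also compare file.ContentType or inspect the first bytes of the upload.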


Create the Result View

@using ImageAnalyzer.Models
@model ResponseModel
@{
    ViewData["Title"] = "AI Result";
}
<div class="container mt-5">
    <div class="card shadow p-4">
        <h2 class="mb-4 text-center">AI Generated Caption & Alt Text</h2>
        @if(Model.Alts.Any())
        {
            <h3>Suggested Alt text</h3>
            <ul>
                @foreach(var alt in Model.Alts)
                {
                    <li>@alt</li>
                }
            </ul>
        }

        @if(Model.Captions.Any())
        {
            <h3>Suggested Captions</h3>
            <ul>
                @foreach(var caption in Model.Captions)
                {
                    <li>@caption</li>
                }
            </ul>
        }

        <a href="/image" class="btn btn-secondary mt-3">Analyze Another Image</a>
    </div>
</div>
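
The Result view and the controller both rely on a ResponseModel in the ImageAnalyzer.Models namespace, but its definition is not shown. A minimal sketch, assuming it only needs the two lists the view iterates over, could look like this. Initializing both lists to empty is an assumption made here so that Model.Alts.Any() and Model.Captions.Any() never run into a null reference.

```csharp
using System.Collections.Generic;

namespace ImageAnalyzer.Models
{
    // Hypothetical shape of the model passed to the Result view.
    public class ResponseModel
    {
        // Caption suggestions returned by the service.
        public List<string> Captions { get; set; } = new();

        // Alt text suggestions returned by the service.
        public List<string> Alts { get; set; } = new();
    }
}
```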


SEO Benefits

Using AI-generated captions and alt text can improve:

  • Google Image Search ranking
  • Accessibility score
  • User engagement
  • Localized content reach
  • Content creation speed

Editors no longer need to write every description by hand; the AI drafts them in seconds.

Conclusion
Using Gemini AI to automate alt text and image captions in ASP.NET Core MVC is:

  • Easy to implement
  • Helpful for SEO
  • Beneficial to accessibility
  • Time-efficient
  • Multilingual

With a few lines of code, your application can describe every uploaded image automatically, making the content richer and more user-friendly.