Build the Future, Today.

Your complete guide to transforming a Windows PC into a professional powerhouse for AI, Scientific Computing, and Engineering development.

Start Building

Unlocking AI on Windows

While Windows is the world's most popular desktop OS, it isn't traditionally considered an ideal platform for serious AI development out of the box. Many core AI tools and libraries are built with a Linux-first mindset, making macOS and Linux distributions the typical choices for developers.

However, this guide changes that. By leveraging the Windows Subsystem for Linux (WSL), we can create a native, high-performance Linux environment directly within Windows. This approach avoids the overhead and clunky integration of traditional virtual machines, giving you the best of both worlds: the familiar Windows user experience and the raw power of a Linux development environment, with direct access to your PC's hardware like the GPU. This is the definitive way to get the most performance out of your Windows machine for AI.

Why Build AI Agents? Use Cases in Tech & Innovation

AI agents are more than just chatbots. They are autonomous systems capable of performing complex tasks, learning from data, and interacting with their environment. In technology and innovation organizations, they are becoming indispensable for work such as automated research, data analysis, and multi-step workflows, the same patterns you will build yourself in Part 9.

Part 1: Installing Windows Subsystem for Linux (WSL)

WSL lets you run a genuine Linux environment directly on Windows. The core system is small, but the Linux distribution (Ubuntu) will grow as you add tools.

Ubuntu Distribution

What it is: A full-featured Linux operating system running inside WSL.

Why you need it: It provides the foundational environment where all our development tools will be installed and run.

Estimated Disk Space: ~1-2 GB for the initial installation, growing as you add more tools.

  1. Open PowerShell as Administrator:

    Search for "PowerShell" in the Start Menu, right-click it, and select "Run as administrator."

  2. Run the installation command:

    In PowerShell, execute the following command. This single command handles enabling required features and installing the default Ubuntu distribution.

    wsl --install
  3. Restart your computer:

    A restart is necessary to complete the installation. After rebooting, a terminal window will open automatically to set up your Linux username and password.
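Once the new Ubuntu terminal opens, a quick sanity check confirms you are really inside WSL (a minimal sketch; the exact kernel string varies by WSL version):

```shell
# On WSL2 the kernel name should mention "microsoft"
uname -r

# /etc/os-release should identify the distribution as Ubuntu
head -2 /etc/os-release
```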

Part 2: Installing the Xfce Desktop Environment

Xfce + xrdp

What it is: Xfce is a lightweight desktop environment (GUI) for Linux. xrdp is a server that lets you connect to it using Windows' built-in Remote Desktop client.

Why you need it: To run graphical applications like plot viewers, simulators, or debuggers that have a user interface.

Estimated Disk Space: ~500 MB.

  1. Update your package lists:
    sudo apt update && sudo apt upgrade -y
  2. Install Xfce and xrdp:
    sudo apt install xfce4 xfce4-goodies xrdp -y
  3. Configure xrdp:

    Run these commands to set Xfce as the default session for xrdp and restart the service.

    echo xfce4-session > ~/.xsession
    sudo service xrdp restart
  4. Find your WSL IP Address:

    You'll need this IP to connect. Find it with this command and look for the address under `eth0`.

    ip addr | grep 'inet '
  5. Connect with Remote Desktop:

    Open the "Remote Desktop Connection" app on Windows, enter the IP address, and log in with your Linux credentials.

Part 3: Installing Core Development Tools

Python & Virtual Environments

What it is: The primary programming language for AI/ML and a tool (`venv`) to create isolated project environments.

Why you need it: Python has a vast ecosystem of libraries. Virtual environments prevent package conflicts between different projects.

Estimated Disk Space: ~100-200 MB for Python and tools.

# Install Python 3, its package manager (pip), and venv
sudo apt install python3 python3-pip python3-venv -y
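A quick check that the toolchain landed (version numbers will vary by Ubuntu release):

```shell
python3 --version    # e.g. Python 3.10.x on Ubuntu 22.04
pip3 --version
python3 -m venv --help > /dev/null && echo "venv module available"
```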

Essential Libraries for AI

Create and activate a virtual environment before installing libraries:

python3 -m venv my_ai_project
source my_ai_project/bin/activate
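If you ever need to confirm which interpreter is active, you can inspect `sys.prefix` from inside Python; in an activated venv it points at the environment directory rather than the system install (a small illustrative check, not required for setup):

```python
import sys

# Inside an activated venv, sys.prefix is the environment directory,
# while sys.base_prefix still points at the interpreter it was created from.
print(sys.prefix)
print(sys.base_prefix)
print("in a venv:", sys.prefix != sys.base_prefix)
```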

Now, inside your active environment (`(my_ai_project)`), install these key libraries:

# PyTorch (CPU): Deep learning framework (~2-3 GB)
pip install torch torchvision torchaudio

# TensorFlow (CPU): Alternative deep learning framework (~1 GB)
pip install tensorflow

# LangChain: Framework for building LLM applications (~50 MB)
pip install langchain

# Transformers: Library for pre-trained models from Hugging Face (~100 MB)
pip install transformers
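After the installs finish, a short smoke test (assuming PyTorch installed successfully) confirms the core library works before you build anything bigger:

```python
import torch

# Multiply two small matrices; every entry of the result should be 3.0
a = torch.ones(2, 3)
b = torch.ones(3, 4)
c = a @ b
print(c.shape)         # torch.Size([2, 4])
print(float(c[0, 0]))  # 3.0
print("CUDA available:", torch.cuda.is_available())
```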

Part 4: Engineering & Scientific Computing Tools

Anaconda Environment Manager

What it is: A popular distribution and environment manager for Python/R, focused on data science.

Why you need it: It simplifies the installation of complex scientific packages and their dependencies. It's a powerful alternative to `pip` and `venv`.

Estimated Disk Space: ~3-4 GB for the base installation.

  1. Download and run the installer:
    cd ~
    wget https://repo.anaconda.com/archive/Anaconda3-2024.02-1-Linux-x86_64.sh
    bash Anaconda3-2024.02-1-Linux-x86_64.sh

    Note: Check the Anaconda archive for the latest version and update the filename.

    Follow the on-screen prompts. It's recommended to accept the license terms and allow the installer to run `conda init`.

  2. Activate the changes: Close and reopen your terminal, or run `source ~/.bashrc`. Your prompt should show `(base)`, indicating you are in the base conda environment.

Specialized Libraries (via Conda)

With Anaconda installed, you can easily create environments and install powerful engineering packages.

# Create a new environment for your work
conda create --name chem_env python=3.10 -y

# Activate the new environment
conda activate chem_env

# Install Cantera: For chemical kinetics & thermodynamics (~200 MB)
conda install -c cantera cantera -y

# Install Pyomo: For mathematical optimization (~50 MB)
conda install -c conda-forge pyomo -y

# Install Pymatgen: For materials analysis (~150 MB)
pip install pymatgen

Part 5: Mastering Your GitHub Workflow

GitHub CLI

What it is: The official command-line tool for GitHub.

Why you need it: It brings pull requests, issues, and other GitHub features to your terminal, speeding up your workflow significantly.

Estimated Disk Space: ~30-50 MB.

  1. Install the GitHub CLI:
    (type -p wget >/dev/null || (sudo apt update && sudo apt install wget -y)) \
    && sudo mkdir -p -m 755 /etc/apt/keyrings \
    && wget -qO- https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo tee /etc/apt/keyrings/githubcli-archive-keyring.gpg > /dev/null \
    && sudo chmod go+r /etc/apt/keyrings/githubcli-archive-keyring.gpg \
    && echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null \
    && sudo apt update \
    && sudo apt install gh -y
  2. Authenticate:
    gh auth login

    Follow the on-screen prompts to log in via your web browser.

SSH Key Authentication

Setting up an SSH key allows you to connect to GitHub without typing your password every time. It's both more convenient and more secure.

  1. Generate a new SSH key:
    ssh-keygen -t ed25519 -C "your_email@example.com"

    Press Enter to accept the default file location and optionally enter a passphrase for extra security.

  2. Copy your public key:
    cat ~/.ssh/id_ed25519.pub

    Highlight and copy the entire output of this command.

  3. Add the SSH key to your GitHub account: Go to Settings > SSH and GPG keys on GitHub, click "New SSH key", give it a title, and paste your key into the "Key" field.

VS Code Extensions for GitHub

Install these from the Extensions marketplace (Ctrl+Shift+X) to manage your entire workflow from the editor.

  • GitHub Pull Requests and Issues: Review and manage PRs and issues without leaving VS Code.
  • GitLens: Supercharges the built-in Git capabilities with code history, blame annotations, and more.
  • GitHub Copilot: The essential AI pair programmer for code completions and suggestions.

Part 6: GPU Acceleration (CUDA & ROCm)

Leveraging your GPU can speed up model training and inference by orders of magnitude. The setup depends on your GPU manufacturer.
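Whichever vendor you have, PyTorch exposes both stacks through the same `torch.cuda` API (ROCm builds reuse it), so one hedged check works for either, assuming PyTorch is installed:

```python
import torch

# ROCm builds of PyTorch also report through torch.cuda, so this check
# covers both NVIDIA and AMD setups.
if torch.cuda.is_available():
    print("Accelerator:", torch.cuda.get_device_name(0))
else:
    print("No GPU visible to PyTorch; workloads will run on the CPU")
```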

For NVIDIA GPUs (CUDA)

  1. Install NVIDIA Drivers on Windows: Ensure you have the latest "Game Ready" or "Studio" driver for your GPU installed on your Windows host system from the NVIDIA website.
  2. Install the CUDA Toolkit in WSL: Run the following commands inside your Ubuntu terminal to install the CUDA toolkit.
    wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
    sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
    wget https://developer.download.nvidia.com/compute/cuda/12.4.1/local_installers/cuda-repo-wsl-ubuntu-12-4-1-local_12.4.1-1_amd64.deb
    sudo dpkg -i cuda-repo-wsl-ubuntu-12-4-1-local_12.4.1-1_amd64.deb
    sudo cp /var/cuda-repo-wsl-ubuntu-12-4-1-local/cuda-*-keyring.gpg /usr/share/keyrings/
    sudo apt-get update
    sudo apt-get -y install cuda-toolkit-12-4
  3. Verify Installation: After installation, run `nvcc --version` to check that the CUDA compiler is installed correctly.

For AMD GPUs (ROCm)

  1. Install AMD Drivers on Windows: Make sure you have the latest Adrenalin Edition drivers for your GPU from the AMD website.
  2. Install ROCm in WSL: Run the following commands inside Ubuntu to add the AMD repository and install the ROCm toolkit.
    sudo apt-get update
    wget https://repo.radeon.com/amdgpu-install/6.1.2/ubuntu/jammy/amdgpu-install_6.1.60102-1_all.deb
    sudo apt-get install ./amdgpu-install_6.1.60102-1_all.deb
    sudo amdgpu-install --usecase=rocm --no-dkms
  3. Verify Installation: After installation, run `rocminfo` to check if your GPU is recognized by the ROCm stack.

Part 7: Interacting with LLMs from the Terminal

Command-line interfaces (CLIs) are a powerful way to quickly interact with large language models for scripting and automation.

Google Gemini CLI

You can use Simon Willison's `llm` tool, which has a plugin for Gemini, for a rich CLI experience.

# Install the base tool and the Gemini plugin
pip install llm llm-gemini

# Set up your Gemini API key
llm keys set gemini
# Paste your API key when prompted

# Run a prompt
llm -m gemini-pro "Explain the theory of relativity in one sentence."

Anthropic Claude CLI

The same `llm` tool works with Claude through the `llm-claude-3` plugin.

# Install the Claude plugin
pip install llm-claude-3

# Set up your Claude API key
llm keys set claude
# Paste your API key when prompted

# Run a prompt against a Claude model
llm -m claude-3-opus "Write a haiku about software development."

Part 8: Getting Started - Tutorials & Resources

Now that your environment is set up, consult the official tutorials for the tools you just installed, such as the PyTorch, LangChain, and Hugging Face documentation, to begin your journey.

Part 9: Use Cases in Action: Your First Agents

Let's put your new environment to work with some agentic examples. These will show you the basic patterns for building agents that can use tools and collaborate.

Example 1: A Simple Calculator Agent

This agent uses a math tool to solve word problems. It demonstrates how an LLM can reason about when to use a tool and how to use its output.

  1. Install packages (the built-in `llm-math` tool also needs `numexpr`):
    pip install langchain langchain-openai numexpr
  2. Create and run `calculator_agent.py`:
    import os
    from langchain_openai import ChatOpenAI
    from langchain.agents import load_tools, initialize_agent, AgentType
    
    # Set your OpenAI key as an environment variable
    # In your terminal: export OPENAI_API_KEY='your-key-here'
    if "OPENAI_API_KEY" not in os.environ:
        print("Please set the OPENAI_API_KEY environment variable.")
    else:
        llm = ChatOpenAI(temperature=0)
        tools = load_tools(["llm-math"], llm=llm)
        agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
        question = "If I have 4 apples and I buy 3 more boxes of apples, with each box containing 5 apples, how many apples do I have in total?"
        response = agent.run(question)
        print(f"\nFinal Answer: {response}")
    

Example 2: A Web Research Agent

This agent uses the Tavily Search API to answer questions about recent events, demonstrating access to external knowledge.

  1. Get a Tavily API Key: Go to tavily.com and sign up for a free API key.
  2. Install packages:
    pip install langchain-community tavily-python
  3. Create and run `research_agent.py`:
    import os
    from langchain_openai import ChatOpenAI
    from langchain_community.tools.tavily_search import TavilySearchResults
    from langchain.agents import initialize_agent, AgentType
    
    # Set your API keys as environment variables
    # export TAVILY_API_KEY='your-key-here'
    # export OPENAI_API_KEY='your-key-here'
    if "TAVILY_API_KEY" not in os.environ or "OPENAI_API_KEY" not in os.environ:
        print("Please set TAVILY_API_KEY and OPENAI_API_KEY environment variables.")
    else:
        llm = ChatOpenAI(temperature=0, model_name="gpt-4")
        tools = [TavilySearchResults()]
        agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
        response = agent.run("Who won the last Formula 1 world championship and what was the final points difference?")
        print(f"\nFinal Answer: {response}")
    

Example 3: Multi-Agent Research Team (with Gemini & LangGraph)

This advanced example shows two agents collaborating. A Researcher finds information, and a Writer drafts a blog post. This pattern is the foundation for more complex autonomous systems.

  1. Install packages (`TavilySearchResults` lives in `langchain-community`):
    pip install langchain-google-genai langgraph langchain-community tavily-python
  2. Create and run `multi_agent_team.py`:
    import os
    from langchain_core.messages import HumanMessage
    from langchain_google_genai import ChatGoogleGenerativeAI
    from langgraph.graph import StateGraph, END
    from langchain_community.tools.tavily_search import TavilySearchResults
    from typing import TypedDict, Annotated, List
    import operator
    
    # Set API keys as environment variables
    # export GOOGLE_API_KEY='your-key-here'
    # export TAVILY_API_KEY='your-key-here'
    if "GOOGLE_API_KEY" not in os.environ or "TAVILY_API_KEY" not in os.environ:
        print("Please set GOOGLE_API_KEY and TAVILY_API_KEY environment variables.")
    else:
        tavily_tool = TavilySearchResults(max_results=4)
        llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro-latest")
    
        class AgentState(TypedDict):
            messages: Annotated[List, operator.add]

        def agent_node(state, agent, system_prompt):
            # Give each agent its role instructions before the conversation so far
            result = agent.invoke([HumanMessage(content=system_prompt)] + state["messages"])
            return {"messages": [result]}

        # The researcher can call the search tool; the writer is a plain LLM.
        # (For brevity the researcher's tool calls are not executed here; a full
        # pipeline would add a LangGraph ToolNode to run them.)
        researcher_agent = llm.bind_tools([tavily_tool])
        researcher_node = lambda state: agent_node(state, researcher_agent, "You are a research assistant. Use the search tool to find information.")
        writer_node = lambda state: agent_node(state, llm, "You are a blog post writer. Write a compelling blog post based on the provided research notes.")
    
        workflow = StateGraph(AgentState)
        workflow.add_node("Researcher", researcher_node)
        workflow.add_node("Writer", writer_node)
        workflow.set_entry_point("Researcher")
        workflow.add_edge("Researcher", "Writer")
        workflow.add_edge("Writer", END)
        graph = workflow.compile()
    
        topic = "Why is LangGraph a powerful framework for building multi-agent systems?"
        events = graph.stream({"messages": [HumanMessage(content=topic)]})
        print(f"--- Starting Research on: {topic} ---\n")
        for event in events:
            for key, value in event.items():
                if key != "__end__":
                    print(f"--- Output from: {key} ---")
                    print(value['messages'][-1].content)
                    print("-" * 40)
    

Example 4: Personal Finance Analyst Agent (with Gemini)

This powerful example shows how to create an agent that can analyze your personal data locally. The agent uses the Gemini model to understand your questions and translates them into Python code to analyze a CSV file of your financial transactions.

  1. Install necessary packages:
    pip install pandas langchain-experimental langchain-google-genai
  2. Prepare your data: Create a file named `transactions.csv` in the same directory as your script with some sample financial data like this:
    Date,Description,Category,Amount
    2024-06-01,Starbucks,Coffee,5.75
    2024-06-01,Shell,Gas,55.43
    2024-06-02,Netflix,Entertainment,15.99
    2024-06-03,Whole Foods,Groceries,124.30
    2024-06-04,Starbucks,Coffee,6.25
    2024-06-05,Amazon,Shopping,45.50
    2024-06-07,Gym Membership,Health,40.00
    2024-06-10,Starbucks,Coffee,5.75
    2024-06-12,Whole Foods,Groceries,88.15
    2024-06-15,Movie Ticket,Entertainment,25.00
    2024-06-20,Savings Transfer,Savings,500.00
    2024-06-25,Edison,Utilities,85.60
    
  3. Create and run `finance_agent.py`: This script will load your CSV and let you ask questions about it.
    import os
    import pandas as pd
    from langchain_experimental.agents.agent_toolkits import create_pandas_dataframe_agent
    from langchain_google_genai import ChatGoogleGenerativeAI
    
    # Set your Google API key as an environment variable
    # export GOOGLE_API_KEY='your-key-here'
    if "GOOGLE_API_KEY" not in os.environ:
        print("Please set the GOOGLE_API_KEY environment variable.")
    else:
        try:
            df = pd.read_csv("transactions.csv")
        except FileNotFoundError:
            print("Error: transactions.csv not found. Please create it in the same directory.")
            exit()
    
        llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro-latest", temperature=0)
        # Recent langchain-experimental releases require explicitly opting in
        # to letting the agent execute generated Python code.
        agent = create_pandas_dataframe_agent(llm, df, agent_type="zero-shot-react-description", verbose=True, allow_dangerous_code=True)
    
        print("--- Personal Finance Agent ---")
        print("Ask me questions about your finances (e.g., 'How much did I spend on coffee?'). Type 'exit' to quit.")
    
        while True:
            user_question = input("> ")
            if user_question.lower() == 'exit':
                break
            
            # Example questions to try:
            # - How much did I spend on groceries?
            # - What were my top 3 spending categories?
            # - Based on my spending, suggest one area where I can save money.
            
            response = agent.run(user_question)
            print(f"Agent: {response}")
    

Example 5: Fine-Tuning a Small LLM on Custom Data (with PyTorch & GPU)

This is the ultimate step: creating a specialized model. We will fine-tune a small, pre-trained model from Hugging Face on our own data. This teaches the model a new skill or personality. The Hugging Face `Trainer` API will automatically use your CUDA or ROCm GPU for a massive speedup.

  1. Install necessary packages:
    pip install transformers datasets accelerate bitsandbytes torch
  2. Prepare your data: Create a file named `pirate_qa.json` to teach the model to speak like a pirate.
    [
        {"text": "<s>[INST] Hello, who are you? [/INST] Ahoy! I be Captain Code, the scurviest pirate in these digital seas. </s>"},
        {"text": "<s>[INST] What is Python? [/INST] Yarrr, Python be a mighty serpent of a language, good for plunderin' data and buildin' treasures. </s>"},
        {"text": "<s>[INST] How does a computer work? [/INST] Shiver me timbers! It be magic smoke and lightning trapped in a box, doin' yer bidding. </s>"}
    ]
    
  3. Create and run `finetune_pirate_model.py`: This script will load the base model, train it on your JSON data, and save the new, specialized model.
    import torch
    from datasets import load_dataset
    from transformers import AutoTokenizer, AutoModelForCausalLM, TrainingArguments, Trainer, pipeline
    
    # 1. Load Model and Tokenizer
    model_name = "distilgpt2"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token # Set padding token
    model = AutoModelForCausalLM.from_pretrained(model_name)
    
    # 2. Load and Prepare Dataset
    dataset = load_dataset("json", data_files="pirate_qa.json", split="train")
    
    def tokenize_function(examples):
        tokens = tokenizer(examples["text"], padding="max_length", truncation=True, max_length=128)
        # Causal-LM fine-tuning needs labels; without them the Trainer has no loss.
        tokens["labels"] = tokens["input_ids"].copy()
        return tokens
    
    tokenized_datasets = dataset.map(tokenize_function, batched=True, remove_columns=["text"])
    
    # 3. Set up Training Arguments
    training_args = TrainingArguments(
        output_dir="./pirate-gpt2",
        num_train_epochs=3,
        per_device_train_batch_size=2,
        warmup_steps=50,
        weight_decay=0.01,
        logging_dir='./logs',
        logging_steps=1,
        # This tells the trainer to use the GPU if available
        use_cpu=not torch.cuda.is_available() 
    )
    
    # 4. Create Trainer and Fine-Tune
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=tokenized_datasets,
    )
    
    print("--- Starting Fine-Tuning ---")
    trainer.train()
    print("--- Fine-Tuning Complete ---")
    
    # 5. Save the model and tokenizer
    trainer.save_model("./pirate-gpt2")
    tokenizer.save_pretrained("./pirate-gpt2")
    
    # 6. Test the new model
    print("\n--- Testing the Fine-Tuned Pirate Model ---")
    pirate_pipeline = pipeline('text-generation', model='./pirate-gpt2', tokenizer='./pirate-gpt2')
    prompt = "<s>[INST] What is machine learning? [/INST]"
    result = pirate_pipeline(prompt, max_length=50)
    print(result[0]['generated_text'])
    

    When you run this script, watch the terminal output. The `Trainer` will print a table showing the training progress (loss, learning rate, etc.). This process will be significantly faster on a GPU than on a CPU. The final output should be a pirate-themed answer to the question!

Part 10: Verifying Your Installation

Run these commands in your WSL terminal to ensure everything is installed and configured correctly.

System & Environment Checks

# Check WSL version details (run this one from PowerShell on Windows, not inside WSL)
wsl -l -v

# Check Python version
python3 --version

# Check Conda version (if installed)
conda --version

# Check Git version
git --version

# Check GitHub CLI version
gh --version

GPU Acceleration Checks

# For NVIDIA GPUs, should display driver and CUDA version
nvidia-smi

# For AMD GPUs, should display GPU info
rocminfo
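If neither command exists yet, a quick hedged check shows which toolchain (if either) is on your PATH without erroring out:

```shell
command -v nvidia-smi >/dev/null && echo "nvidia-smi found" || echo "nvidia-smi not on PATH"
command -v rocminfo   >/dev/null && echo "rocminfo found"   || echo "rocminfo not on PATH"
```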

AI & Scientific Library Checks

Activate one of your Python environments (e.g., `source my_ai_project/bin/activate` or `conda activate chem_env`) and run this Python script to test imports.

# Save as test_imports.py and run with `python test_imports.py`
try:
    import torch
    print(f"PyTorch version: {torch.__version__}")
    if torch.cuda.is_available():
        print(f"PyTorch CUDA available: {torch.cuda.get_device_name(0)}")
    else:
        print("PyTorch CUDA not available.")
        
    import tensorflow as tf
    print(f"TensorFlow version: {tf.__version__}")
    print(f"TensorFlow found GPUs: {len(tf.config.list_physical_devices('GPU'))}")

    import langchain
    print(f"LangChain version: {langchain.__version__}")

    import cantera as ct
    print(f"Cantera version: {ct.__version__}")

    print("\nAll major libraries imported successfully!")

except ImportError as e:
    print(f"A library failed to import: {e}")

What's Next?

Congratulations! You now have a professional-grade development environment. You've installed the tools, understand the basic concepts, and have run your first agents.

The journey doesn't end here. The best way to learn is by building. Pick a project that interests you—automate a tedious task, build a research tool, or create a complex simulation. Use the tutorials in Part 8 as a starting point and explore the vast possibilities of your new toolkit.