The Evolution of Cursor Rules: A More Granular Approach to AI Guidance

In our previous post about AI’s impact on software development, we discussed how .cursorrules files were becoming an integral part of modern codebases. Since then, Cursor has evolved its approach to AI guidance, introducing a more powerful and flexible system: Project Rules.

The Shift from Single File to Directory-Based Rules

While the .cursorrules file served us well as a simple way to provide AI guidance, it had limitations. All rules had to live in a single file, making it difficult to organize complex projects with different requirements for different parts of the codebase. Cursor’s new approach moves away from this monolithic structure to a more granular, directory-based system.

The new system stores rules in the .cursor/rules directory, offering several key advantages:

  1. Folder-Specific Rules: Different parts of your project can have different rules
  2. Pattern Matching: Rules can be applied based on file patterns
  3. Automatic Attachment: Relevant rules are included when matching files are referenced
  4. Better Organization: Rules can be split into logical groups, as the sample layout below shows
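
For example, a project might end up with a layout like this (a sketch; the file names are whatever you choose, and mirror the examples later in this post):

```
.cursor/
└── rules/
    ├── data-science.yaml      # src/models and notebooks
    ├── fastapi-services.yaml  # src/api
    └── testing.yaml           # tests
```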

Real-World Examples

Let’s look at some practical examples of how to structure rules in this new system:

Data Science Module Rules

````yaml
# .cursor/rules/data-science.yaml
description: "Rules for data science module development"
patterns: ["src/models/**/*.py", "notebooks/**/*.ipynb"]
instructions: """
When working with data science modules:
- Use type hints for all function parameters and return values
- Include docstrings with numpy style documentation
- Follow scikit-learn's fit/transform pattern for transformers
- Implement proper data validation in __init__
- Use dataclasses for model configuration
- Include logging for training and inference
- Cache heavy computations using functools.lru_cache

Example structure:
```python
from dataclasses import dataclass
from typing import Optional

import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin

@dataclass
class ModelConfig:
    n_components: int
    random_state: Optional[int] = None
    
class CustomTransformer(BaseEstimator, TransformerMixin):
    """Custom transformation for feature engineering.
    
    Parameters
    ----------
    config : ModelConfig
        Configuration parameters for the transformer
        
    Attributes
    ----------
    is_fitted_ : bool
        Flag indicating if the transformer is fitted
    """
    
    def __init__(self, config: ModelConfig):
        self.config = config
        self.is_fitted_ = False
        
    def fit(self, X: pd.DataFrame, y: Optional[pd.Series] = None) -> 'CustomTransformer':
        # Validation
        if not isinstance(X, pd.DataFrame):
            raise TypeError("X must be a pandas DataFrame")
            
        # Fit logic here
        self.is_fitted_ = True
        return self
        
    def transform(self, X: pd.DataFrame) -> pd.DataFrame:
        # Transform logic here; require fit before transform
        if not self.is_fitted_:
            raise RuntimeError("Transformer must be fitted before calling transform")
        return X
```
"""
````

FastAPI Web Service Rules

````yaml
# .cursor/rules/fastapi-services.yaml
description: "Guidelines for FastAPI service implementations"
patterns: ["src/api/**/*.py"]
instructions: """
For FastAPI service implementations:
- Use Pydantic models for request/response validation
- Implement proper dependency injection
- Include OpenAPI documentation
- Add proper error handling with HTTP status codes
- Use environment variables for configuration
- Implement rate limiting for public endpoints
- Add proper logging and monitoring

Example pattern:
```python
import os
from typing import List, Optional

from fastapi import Depends, FastAPI, HTTPException
from loguru import logger
from pydantic import BaseModel, Field

app = FastAPI()

class PredictionRequest(BaseModel):
    features: List[float] = Field(..., description="Input features for prediction")
    model_version: Optional[str] = Field(None, description="Model version to use")
    
    class Config:
        schema_extra = {
            "example": {
                "features": [1.0, 2.0, 3.0],
                "model_version": "v1"
            }
        }

class PredictionResponse(BaseModel):
    prediction: float
    confidence: float
    model_version: str

async def get_model():
    # Dependency injection for model loading;
    # load_model is a placeholder for your own loading helper
    try:
        model = load_model(os.getenv("MODEL_PATH"))
        return model
        return model
    except Exception as e:
        logger.error(f"Failed to load model: {e}")
        raise HTTPException(status_code=500, detail="Model loading failed")

@app.post("/predict", response_model=PredictionResponse)
async def predict(
    request: PredictionRequest,
    model = Depends(get_model)
):
    try:
        prediction = model.predict(request.features)
        return PredictionResponse(
            prediction=float(prediction),
            confidence=0.95,  # Example
            model_version=model.version
        )
    except Exception as e:
        logger.error(f"Prediction failed: {e}")
        raise HTTPException(status_code=500, detail="Prediction failed")
```
"""
````

Testing Standards

````yaml
# .cursor/rules/testing.yaml
description: "Testing standards and patterns"
patterns: ["tests/**/*.py"]
instructions: """
Testing guidelines:
- Use pytest as the testing framework
- Follow the Arrange-Act-Assert pattern
- Use fixtures for common setup
- Mock external dependencies with pytest-mock
- Test edge cases and error scenarios
- Use parametrize for multiple test cases
- Aim for 80% coverage minimum

Example test structure:
```python
import numpy as np
import pandas as pd
import pytest

# Classes under test (module path is illustrative)
from src.models.transformer import CustomTransformer, ModelConfig

@pytest.fixture
def sample_data():
    return pd.DataFrame({
        'feature1': np.random.randn(100),
        'feature2': np.random.randn(100)
    })

@pytest.fixture
def model_config():
    return ModelConfig(n_components=2, random_state=42)

def test_transformer_validation(sample_data, model_config):
    # Arrange
    transformer = CustomTransformer(model_config)
    
    # Act & Assert
    with pytest.raises(TypeError, match="X must be a pandas DataFrame"):
        transformer.fit(sample_data.values)  # Passing numpy array instead of DataFrame

@pytest.mark.parametrize("n_components,expected_shape", [
    (2, (100, 2)),
    (3, (100, 3)),
])
def test_transformer_output_shape(sample_data, n_components, expected_shape):
    # Arrange
    config = ModelConfig(n_components=n_components)
    transformer = CustomTransformer(config)
    
    # Act
    result = transformer.fit_transform(sample_data)
    
    # Assert
    assert result.shape == expected_shape
```
"""
````

Benefits of the New Approach

The directory-based approach offers several advantages over the single .cursorrules file:

  1. Better Organization: Rules are logically grouped and easier to maintain
  2. Granular Control: Apply specific rules to specific parts of your codebase
  3. Easier Collaboration: Team members can work on different rules without conflicts
  4. Improved Clarity: Each rule file can focus on a specific aspect of development
  5. Version Control Friendly: Changes to rules are more trackable and reviewable

Migrating from .cursorrules

If you’re still using a .cursorrules file, Cursor will continue to support it for backward compatibility. However, the Cursor team recommends migrating to the new system for better flexibility and control. Here’s a simple migration process:

  1. Create a .cursor/rules directory in your project root
  2. Break down your existing rules into logical groups
  3. Create separate YAML files for each group, as sketched below
  4. Add appropriate pattern matching to each rule file
  5. Remove the old .cursorrules file once migration is complete
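
To make step 3 concrete, here is a minimal sketch of one rule file carved out of a hypothetical .cursorrules section (the file name and field values are illustrative, reusing conventions from the examples above):

```yaml
# .cursor/rules/python-style.yaml
description: "General Python conventions carried over from .cursorrules"
patterns: ["**/*.py"]
instructions: """
- Use type hints for all function parameters and return values
- Include docstrings with numpy style documentation
"""
```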

Looking Forward

This evolution in Cursor’s rules system reflects a broader trend in AI-assisted development: the need for more precise and contextual guidance. As AI becomes more sophisticated, our ability to provide nuanced instructions becomes increasingly important.

I could see these sorts of rules becoming shared tools that one might install in a repo, or a team might share across all their repos. Frameworks themselves might ship with rules: imagine starting a new Django project and having a set of rules to follow to get idiomatic code generated.

The new system allows us to create more maintainable and scalable codebases while ensuring AI assistance remains helpful and contextually aware. It’s a significant step forward in making AI a more effective partner in the development process.


What are your thoughts on Cursor’s new rules system? Share your experiences or questions on Twitter/X!