yek
A fast Rust-based tool to read text-based files in a repository or directory, chunk them, and serialize them for LLM consumption. By default, the tool:
- Uses .gitignore rules to skip unwanted files.
- Uses the Git history to infer what files are important.
- Infers additional ignore patterns (binary, large, etc.).
- Splits content into chunks based on either approximate “token” count or byte size.
- Automatically detects if output is being piped and streams content instead of writing to files.
- Supports processing multiple directories in a single command.
- Configurable via a yek.toml file.
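The pipe detection mentioned in the list above can be illustrated with a small Python sketch. This is not yek's Rust implementation: the helper name `choose_sink` and its return labels are hypothetical, and the sketch only assumes the common convention that a standard output not attached to a terminal is being piped or redirected.

```python
import io
import sys


def choose_sink(stream=sys.stdout):
    """Return "stream" when output is piped or redirected, else "file".

    Hypothetical helper: a stream that is not attached to a terminal is
    treated as a pipe, so content would be streamed instead of being
    written to a file in the temp directory.
    """
    return "file" if stream.isatty() else "stream"


if __name__ == "__main__":
    # An in-memory buffer is never a terminal, so it behaves like a pipe.
    print(choose_sink(io.StringIO()))
    # The result for the real stdout depends on how the script is invoked.
    print(choose_sink())
```

Running the script directly in a terminal and then again with its output piped into another command shows the second line switching between the two modes.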
Yek (يک) means “One” in Farsi/Persian.
Consider having a simple repo like this:
.
├── README.md
├── src
│   ├── main.rs
│   └── utils.rs
└── tests
    └── test.rs
Running yek in this directory produces a single file containing the serialized contents of the repository and writes it to the temp directory.
Note
yek places more important files last in the output. This is useful for LLM consumption because LLMs tend to pay more attention to content that appears later in the context.
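The ordering described in this note can be sketched in a few lines of Python. The function name `order_for_output` and the example scores are hypothetical; how yek computes its scores internally is not shown here.

```python
def order_for_output(scored_files):
    """Sort (path, score) pairs so the highest-scoring path comes last."""
    # sorted() is stable, so files with equal scores keep their input order.
    ordered = sorted(scored_files, key=lambda item: item[1])
    return [path for path, _ in ordered]


if __name__ == "__main__":
    # Illustrative scores only; they are not produced by yek.
    files = [("src/main.rs", 90), ("README.md", 1), ("tests/test.rs", 10)]
    for path in order_for_output(files):
        print(path)
```

With these illustrative scores, `src/main.rs` is emitted last, which places it closest to the end of the LLM context.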
Installation
For Unix-like systems (macOS, Linux):
curl -fsSL https://bodo.run/yek.sh | bash
For Windows (PowerShell):
irm https://bodo.run/yek.ps1 | iex
or build from source
- Install Rust.
- Clone this repository.
- Run make macos or make linux to build for your platform (both run cargo build --release).
- Add to your PATH:
export PATH=$(pwd)/target/release:$PATH
Usage
yek has sensible defaults: you can simply run yek in a directory to serialize the entire repository. By default, it serializes all files in the repository into chunks of 10MB. The output is written to the temp directory, and the file path is printed to the console.
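As a rough illustration of size-based chunking, the Python sketch below greedily packs files into chunks that stay under a byte limit. It is a sketch under stated assumptions, not yek's algorithm: the function name `chunk_by_size` is hypothetical, 10MB is taken to mean 10 * 1024 * 1024 bytes, and a file larger than the limit is kept whole in its own chunk.

```python
def chunk_by_size(files, max_bytes=10 * 1024 * 1024):
    """Greedily pack (path, text) pairs into chunks of at most max_bytes.

    Assumption: a single file larger than max_bytes is kept whole in a
    chunk of its own rather than being split.
    """
    chunks, current, current_size = [], [], 0
    for path, text in files:
        size = len(text.encode("utf-8"))
        # Start a new chunk when adding this file would exceed the limit.
        if current and current_size + size > max_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    demo = [("a.txt", "x" * 6), ("b.txt", "y" * 6), ("c.txt", "z" * 3)]
    print(chunk_by_size(demo, max_bytes=10))
```

Because files are packed in input order, feeding the function a list that is already sorted by ascending priority leaves the highest-priority files in the final chunk.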
Examples
Process current directory and write to temp directory:
yek
Pipe output to clipboard (macOS):
yek | pbcopy
Cap the max size to 128K tokens and only process the src directory:
yek --max-size 128K --tokens src/
Note
When multiple chunks are written, the last chunk will contain the highest-priority files.
Cap the max size to 100KB and only process the src directory, writing to a specific directory:
yek --max-size 100KB --output-dir /tmp/yek src/
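The --max-size values in the examples above (128K, 100KB) combine a number with a unit suffix. The Python sketch below shows one way such strings could be parsed. It is illustrative only: the function name `parse_max_size` is hypothetical, and the decimal multipliers are an assumption, since this README does not say whether yek treats K as 1000 or 1024.

```python
import re

# Assumed decimal multipliers; the real tool may use 1024-based units.
_MULTIPLIERS = {"": 1, "K": 1_000, "KB": 1_000, "M": 1_000_000, "MB": 1_000_000}


def parse_max_size(text):
    """Parse strings such as "128K" or "100KB" into an integer count."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]*)\s*", text)
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    number, unit = match.groups()
    unit = unit.upper()
    if unit not in _MULTIPLIERS:
        raise ValueError(f"unknown unit in size: {text!r}")
    return int(number) * _MULTIPLIERS[unit]


if __name__ == "__main__":
    for value in ("128K", "100KB", "42"):
        print(value, parse_max_size(value))
```

Whether the resulting number counts tokens or bytes would be decided separately, for example by the presence of the --tokens flag.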
Process multiple directories:
yek src/ tests/
CLI Reference
Configuration File
You can place a file called yek.toml at your project root or pass a custom path via --config. The configuration file allows you to:
- Add custom ignore patterns
- Define file priority rules for processing order
- Add additional binary file extensions to ignore (extends the built-in list)
- Configure Git-based priority boost
Example yek.toml
The yek.toml file is optional. If you use one, place it at the root of your project.
# Add patterns to ignore (in addition to .gitignore)
ignore_patterns = [
  "node_modules/",
  "\\.next/",
  "my_custom_folder/"
]

# Configure Git-based priority boost (optional)
git_boost_max = 50 # Maximum score boost based on Git history (default: 100)

# Add additional binary file extensions to ignore
# These extend the built-in list (.jpg, .png, .exe, etc.)
# Top-level keys must come before the [[priority_rules]] tables;
# otherwise TOML assigns them to the last table.
binary_extensions = [
  ".blend", # Blender files
  ".fbx", # 3D model files
  ".max", # 3ds Max files
  ".psd", # Photoshop files
]

# Define priority rules for processing order
# Higher scores are processed first
[[priority_rules]]
score = 100
pattern = "^src/lib/"

[[priority_rules]]
score = 90
pattern = "^src/"

[[priority_rules]]
score = 80
pattern = "^docs/"
All configuration keys are optional. By default:
- No extra ignore patterns
- All files have equal priority (score: 1)
- Git-based priority boost maximum is 100
- Common binary file extensions are ignored (.jpg, .png, .exe, etc. – see source for full list)
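To make the priority rules more concrete, the Python sketch below applies the three rules from the example yek.toml to a few paths. It is a minimal sketch, not yek's implementation: the names `PRIORITY_RULES` and `rule_score` are hypothetical, the patterns are assumed to be regular expressions, and overlapping rules are assumed to resolve to the highest matching score (the real tool may combine them differently, for example by summing). Files that match no rule fall back to the default score of 1 stated above.

```python
import re

# Rules mirroring the example yek.toml.
PRIORITY_RULES = [
    {"score": 100, "pattern": r"^src/lib/"},
    {"score": 90, "pattern": r"^src/"},
    {"score": 80, "pattern": r"^docs/"},
]


def rule_score(path, rules=PRIORITY_RULES, default=1):
    """Return the highest score among rules whose pattern matches path.

    Assumption: overlapping rules resolve to the highest score; the real
    tool may combine them differently (for example by summing).
    """
    matches = [rule["score"] for rule in rules if re.search(rule["pattern"], path)]
    return max(matches, default=default)


if __name__ == "__main__":
    for path in ("src/lib/core.rs", "src/main.rs", "docs/intro.md", "README.md"):
        print(path, rule_score(path))
```

Any Git-based boost, capped by git_boost_max, would be applied on top of this rule score; that step is left out here because its exact formula is not described in this README.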
Performance
yek is fast. It’s written in Rust and does many things in parallel to speed up processing.
Here is a benchmark comparing it to Repomix serializing the Next.js project:
time yek
Executed in    5.19 secs    fish           external
   usr time    2.85 secs   54.00 micros    2.85 secs
   sys time    6.31 secs  629.00 micros    6.31 secs

time repomix
Executed in   22.24 mins    fish           external
   usr time   21.99 mins    0.18 millis   21.99 mins
   sys time    0.23 mins    1.72 millis    0.23 mins
yek is roughly 257x faster than repomix in this benchmark (22.24 minutes versus 5.19 seconds of wall-clock time).
Roadmap
See proposed features. I am open to accepting new feature requests. Please write a detailed proposal to discuss new features.
License
MIT