uutils at GSoC
$ uutils gsoc --help

Google Summer of Code is a global, online program focused on bringing new contributors into open source software development. GSoC contributors work with an open source organization on a 12+ week programming project under the guidance of mentors.

If you want to know more about how it works, check out the links below.

Useful links:

What is it about?

The uutils project aims to rewrite key Linux utilities in Rust, targeting coreutils, findutils, diffutils, sed, grep, awk, procps, util-linux, and bsdutils. Its goal is to create fully compatible, high-performance drop-in replacements and to ensure reliability by running the upstream test suites. coreutils is already production-ready and shipping in distributions; findutils, diffutils and grep are well advanced, while sed, awk and the system utilities (procps, util-linux, bsdutils) are at earlier stages of development.

How to get started

Here are some steps to follow if you want to apply for a GSoC project with uutils.

  1. Check the requirements. You have to meet Google's requirements to apply. Specifically for uutils, it's best if you at least know some Rust and have some familiarity with using the coreutils and the other tools.

  2. Reach out to us! We are happy to discuss potential projects and help you find a meaningful project for uutils. Tell us what interests you about the project and what experience you have and we can find a suitable project together. You can talk to the uutils maintainers on the Discord server. In particular, you can contact:

    • Sylvestre Ledru (@sylvestre on GitHub and Discord)

  3. Get comfortable with uutils. To find a good project you need to understand the codebase. We recommend that you take a look at the code and the issue tracker, and maybe try to tackle some good-first-issues across any of our projects. Also take a look at our contributor guidelines.

  4. Find a project and a mentor. We have a list of potential projects you can adapt or use as inspiration. Make sure to discuss your ideas with the maintainers! Some project ideas below have suggested mentors you could contact.

  5. Write the application. You can do this with your mentor. The application has to go through Google, so make sure to follow all the advice in Google's Contributor Guide. Please make sure you include your prior contributions to uutils in your application.

Tips

Project Ideas

These are starting points for a Google Summer of Code project with uutils. Feel free to adapt one or propose your own, and see the guidelines for the project list. Each idea lists its difficulty, an estimated size (~175 or ~350 hours) and a suggested mentor where one is available.

Performance optimization for coreutils

Medium ~175h Mentor: TBD

uutils/coreutils has strong GNU compatibility, but some utilities can still be made faster. Systematically profile, benchmark and optimize the hot paths so they match or beat GNU coreutils.

  • Profile utilities with perf, flamegraph and criterion
  • Build a benchmark suite comparing against GNU coreutils
  • Optimize hot paths in cat, cut, sort, uniq, wc, etc.
  • Reduce allocations, improve buffering, use SIMD where it helps

Skills: Rust, performance profiling, systems/I/O optimization; SIMD a plus.
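
As an illustration of the buffering point above, here is a minimal sketch of a `wc -l` style hot path that scans large chunks instead of reading one byte per call. The function name `count_lines` and the 64 KiB buffer size are assumptions made for this example, not the actual coreutils code.

```rust
use std::io::{self, Read};

/// Count newline bytes by scanning 64 KiB chunks rather than reading
/// one byte per call. The buffer size is an arbitrary starting point.
fn count_lines<R: Read>(mut reader: R) -> io::Result<u64> {
    let mut buf = vec![0u8; 64 * 1024];
    let mut lines = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // Retry reads interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        lines += buf[..n].iter().filter(|&&b| b == b'\n').count() as u64;
    }
    Ok(lines)
}

fn main() {
    let data: &[u8] = b"one\ntwo\nthree\n";
    println!("{}", count_lines(data).unwrap());
}
```

A project would measure variants of such a loop (buffer size, SIMD byte counting) against GNU `wc` before deciding which one to keep.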

Expand differential fuzzing for coreutils

Medium ~175h Mentor: TBD

coreutils has some fuzzing infrastructure, but many utilities lack coverage. Expand differential fuzzing that compares uutils against GNU to catch discrepancies automatically.

  • Add fuzz targets for utilities that currently lack them
  • Build differential harnesses comparing uutils vs GNU output
  • Run campaigns with AFL++ and libFuzzer, wire them into CI
  • Triage and fix the bugs the fuzzers find

Skills: Rust, fuzzing tools (AFL++, libFuzzer, cargo-fuzz), differential testing.
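
The differential idea can be sketched without any fuzzing framework: feed the same generated input to a reference and a candidate, and report the first input on which they disagree. Everything here is hypothetical (`XorShift`, `reference_count`, `buggy_count`, `find_mismatch`); a real harness would use cargo-fuzz or AFL++ and compare against the GNU binary rather than an in-process function.

```rust
/// Small xorshift generator so the sketch needs no external crates.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Reference behaviour: count newline bytes, as `wc -l` does.
fn reference_count(input: &[u8]) -> usize {
    input.iter().filter(|&&b| b == b'\n').count()
}

/// Deliberately buggy candidate: it also counts an unterminated last line.
fn buggy_count(input: &[u8]) -> usize {
    let segments = input.split(|&b| b == b'\n').count();
    if input.is_empty() || input.ends_with(b"\n") {
        segments - 1
    } else {
        segments
    }
}

/// Run both implementations on generated inputs and return the first
/// input on which their results differ. The seed must be non-zero.
fn find_mismatch(
    reference: fn(&[u8]) -> usize,
    candidate: fn(&[u8]) -> usize,
    iterations: usize,
    seed: u64,
) -> Option<Vec<u8>> {
    let mut rng = XorShift(seed);
    for _ in 0..iterations {
        let len = (rng.next() % 17) as usize;
        let input: Vec<u8> = (0..len)
            .map(|_| b"ab \n"[(rng.next() % 4) as usize])
            .collect();
        if reference(&input) != candidate(&input) {
            return Some(input);
        }
    }
    None
}

fn main() {
    match find_mismatch(reference_count, buggy_count, 1000, 42) {
        Some(input) => println!("mismatch on {:?}", String::from_utf8_lossy(&input)),
        None => println!("no mismatch found"),
    }
}
```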

Complete findutils GNU compatibility

Medium ~175h Mentor: Sylvestre

uutils/findutils already passes more than half of the GNU findutils and BFS tests. Finish the remaining work to reach full compatibility and production readiness.

  • Implement missing options and predicates for find
  • Fix edge cases in traversal and symlink handling
  • Complete xargs argument handling
  • Pass the remaining GNU tests; add differential fuzzing

Skills: Rust, filesystem operations, find/xargs usage; fuzzing a plus.
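
To give a feel for what "options and predicates" means in practice, here is a heavily simplified sketch of a `find` expression tree and its evaluation. The `Expr` and `Entry` types and the byte-wise `glob_match` (only `*` and `?`, no bracket expressions) are assumptions for illustration and do not mirror the uutils/findutils internals.

```rust
/// Hypothetical, heavily simplified model of a `find` expression.
enum Expr {
    Name(String), // -name PATTERN, supporting only `*` and `?`
    TypeDir,      // -type d
    Not(Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

/// Hypothetical stand-in for a directory entry.
struct Entry<'a> {
    name: &'a str,
    is_dir: bool,
}

/// Byte-wise glob match: `*` matches any run of bytes, `?` exactly one.
/// Simple recursive version; worst-case time is exponential.
fn glob_match(pattern: &[u8], text: &[u8]) -> bool {
    match (pattern.split_first(), text.split_first()) {
        (None, None) => true,
        (Some((b'*', rest)), _) => {
            glob_match(rest, text) || (!text.is_empty() && glob_match(pattern, &text[1..]))
        }
        (Some((b'?', p_rest)), Some((_, t_rest))) => glob_match(p_rest, t_rest),
        (Some((p, p_rest)), Some((t, t_rest))) if p == t => glob_match(p_rest, t_rest),
        _ => false,
    }
}

fn eval(expr: &Expr, entry: &Entry) -> bool {
    match expr {
        Expr::Name(pattern) => glob_match(pattern.as_bytes(), entry.name.as_bytes()),
        Expr::TypeDir => entry.is_dir,
        Expr::Not(inner) => !eval(inner, entry),
        Expr::And(a, b) => eval(a, entry) && eval(b, entry),
        Expr::Or(a, b) => eval(a, entry) || eval(b, entry),
    }
}

fn main() {
    // Roughly: \( -name '*.rs' -o -name '*.toml' \) -a ! -type d
    let expr = Expr::And(
        Box::new(Expr::Or(
            Box::new(Expr::Name("*.rs".into())),
            Box::new(Expr::Name("*.toml".into())),
        )),
        Box::new(Expr::Not(Box::new(Expr::TypeDir))),
    );
    let entry = Entry { name: "Cargo.toml", is_dir: false };
    println!("{}", eval(&expr, &entry));
}
```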

Complete diffutils GNU compatibility

Medium ~175h Mentor: TBD

uutils/diffutils implements diff, diff3, cmp and sdiff. Complete the remaining features and edge cases so it passes the GNU test suite.

  • Implement missing options and output formats for diff
  • Improve algorithm efficiency for large files
  • Complete diff3 three-way merges
  • Pass the GNU diffutils tests; add differential fuzzing

Skills: Rust, diff algorithms (Myers, Patience), text processing.
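
As a baseline for the algorithm work, the following sketch computes a line diff with the textbook longest-common-subsequence table. It is not Myers or Patience and uses O(n·m) memory, which is exactly the kind of cost a large-file optimization would have to avoid. The `Edit` type and `diff` function are names invented for this example.

```rust
#[derive(Debug, PartialEq)]
enum Edit<'a> {
    Keep(&'a str),
    Delete(&'a str),
    Insert(&'a str),
}

/// Line diff based on a full LCS table (quadratic time and memory).
fn diff<'a>(old: &[&'a str], new: &[&'a str]) -> Vec<Edit<'a>> {
    let (n, m) = (old.len(), new.len());
    // lcs[i][j] = length of the LCS of old[i..] and new[j..]
    let mut lcs = vec![vec![0usize; m + 1]; n + 1];
    for i in (0..n).rev() {
        for j in (0..m).rev() {
            lcs[i][j] = if old[i] == new[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }
    // Walk the table from the top-left corner to build the edit script.
    let (mut i, mut j) = (0, 0);
    let mut edits = Vec::new();
    while i < n && j < m {
        if old[i] == new[j] {
            edits.push(Edit::Keep(old[i]));
            i += 1;
            j += 1;
        } else if lcs[i + 1][j] >= lcs[i][j + 1] {
            edits.push(Edit::Delete(old[i]));
            i += 1;
        } else {
            edits.push(Edit::Insert(new[j]));
            j += 1;
        }
    }
    edits.extend(old[i..].iter().map(|&line| Edit::Delete(line)));
    edits.extend(new[j..].iter().map(|&line| Edit::Insert(line)));
    edits
}

fn main() {
    for edit in diff(&["a", "b", "c"], &["a", "c", "d"]) {
        match edit {
            Edit::Keep(line) => println!("  {line}"),
            Edit::Delete(line) => println!("- {line}"),
            Edit::Insert(line) => println!("+ {line}"),
        }
    }
}
```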

Complete the Rust implementation of sed

Medium ~175h Mentor: TBD

uutils/sed has been started but needs significant work to fully match GNU sed and POSIX. Implement the missing commands and edge cases and make it pass the GNU test suite.

  • Implement missing commands and addressing flags
  • Handle complex regex, backreferences and multi-line pattern space
  • Implement hold-space operations correctly
  • Pass the GNU sed tests; add differential fuzzing

Skills: Rust, regular expressions, sed scripting, text processing.
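
The interplay of pattern space and hold space is easiest to see on a small model. The sketch below supports only three commands (`G`, `h`, `p`) and three address forms, enough to run the classic line-reversal script `sed -n '1!G;h;$p'`. The `Addr` and `Cmd` types and the `run` function are hypothetical and assume `-n` (no automatic printing); they are not the uutils/sed design.

```rust
/// Which lines a command applies to (tiny subset of sed addressing).
enum Addr {
    Always,
    NotFirst, // 1!
    Last,     // $
}

/// Tiny subset of sed commands.
enum Cmd {
    AppendHold, // G: append a newline and the hold space to the pattern space
    CopyToHold, // h: replace the hold space with the pattern space
    Print,      // p: print the pattern space
}

/// Run a script over the input as `sed -n` would: no automatic printing.
fn run(script: &[(Addr, Cmd)], input: &str) -> String {
    let lines: Vec<&str> = input.lines().collect();
    let mut hold = String::new();
    let mut out = String::new();
    for (idx, &line) in lines.iter().enumerate() {
        // Each cycle starts with a fresh pattern space; hold space persists.
        let mut pattern = line.to_string();
        for (addr, cmd) in script {
            let selected = match addr {
                Addr::Always => true,
                Addr::NotFirst => idx != 0,
                Addr::Last => idx + 1 == lines.len(),
            };
            if !selected {
                continue;
            }
            match cmd {
                Cmd::AppendHold => {
                    pattern.push('\n');
                    pattern.push_str(&hold);
                }
                Cmd::CopyToHold => hold = pattern.clone(),
                Cmd::Print => {
                    out.push_str(&pattern);
                    out.push('\n');
                }
            }
        }
    }
    out
}

fn main() {
    // Equivalent of: sed -n '1!G;h;$p'
    let script = [
        (Addr::NotFirst, Cmd::AppendHold),
        (Addr::Always, Cmd::CopyToHold),
        (Addr::Last, Cmd::Print),
    ];
    print!("{}", run(&script, "a\nb\nc\n"));
}
```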

Rust implementation of grep

Hard ~350h Mentor: TBD

Build a high-performance, feature-complete drop-in replacement for GNU grep - full command-line interface, output modes and edge-case behavior, with the performance Rust can provide.

  • Implement the full POSIX/GNU grep CLI
  • Support BRE, ERE and PCRE patterns
  • Handle context lines, recursive search, binary and compressed files
  • Pass the GNU grep tests; benchmark against GNU grep

Skills: Rust, regex engines, performance optimization, I/O.
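
Context-line handling is one of the edge-case areas mentioned above. A minimal sketch, assuming fixed-string matching and hypothetical `before`/`after` counts in the spirit of `-B` and `-A`: it computes which line indices would be printed, merging overlapping context windows. The function name `select_lines` is invented, and printing of group separators is left out.

```rust
/// Return the 0-based indices of lines to print when searching for the
/// fixed string `needle` with `before` and `after` lines of context.
/// Overlapping context windows are merged.
fn select_lines(lines: &[&str], needle: &str, before: usize, after: usize) -> Vec<usize> {
    let mut keep = vec![false; lines.len()];
    for (i, line) in lines.iter().enumerate() {
        if line.contains(needle) {
            // Clamp the window to the bounds of the input.
            let start = i.saturating_sub(before);
            let end = (i + after).min(lines.len() - 1);
            for k in start..=end {
                keep[k] = true;
            }
        }
    }
    (0..lines.len()).filter(|&i| keep[i]).collect()
}

fn main() {
    let lines = ["a", "b", "match", "c", "d", "match again", "e"];
    for i in select_lines(&lines, "match", 1, 0) {
        println!("{}:{}", i + 1, lines[i]);
    }
}
```

A streaming implementation would keep only a ring buffer of the last `before` lines instead of a flag per line; the flag vector is used here only to keep the logic obvious.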

Rust implementation of awk

Hard ~350h Mentor: TBD

Implement awk, a complete programming language for pattern scanning and text processing, targeting POSIX awk and GNU awk (gawk) extensions.

  • Build the lexer, parser and pattern-action execution engine
  • Support built-in variables (NR, NF, FS, RS) and functions
  • Implement field splitting, arrays, user functions and control flow
  • Set up GNU test suite execution for validation

Skills: Rust, lexers/parsers/interpreters, regex; language implementation a plus.
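
Field splitting with the default `FS` is a good first piece to prototype. The sketch below assumes newline-separated records and the default separator behaviour (runs of spaces and tabs separate fields, leading and trailing blanks are ignored). `split_fields` and `read_records` are hypothetical helper names; a real implementation would also handle custom `FS`/`RS` values and regex separators.

```rust
/// Split one record using awk's default field separator: runs of spaces
/// and tabs separate fields, leading and trailing blanks are ignored.
fn split_fields(record: &str) -> Vec<&str> {
    record
        .split(|c| c == ' ' || c == '\t')
        .filter(|field| !field.is_empty())
        .collect()
}

/// Pair every newline-separated record with its record number (NR).
fn read_records(input: &str) -> Vec<(usize, Vec<&str>)> {
    input
        .lines()
        .enumerate()
        .map(|(i, line)| (i + 1, split_fields(line)))
        .collect()
}

fn main() {
    // Rough equivalent of: awk '{ print NR, NF, $1 }'
    let input = "alpha beta\n  gamma\t delta epsilon \n";
    for (nr, fields) in read_records(input) {
        let first = fields.first().copied().unwrap_or("");
        println!("{} {} {}", nr, fields.len(), first);
    }
}
```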

Complete procps implementation

Medium ~350h Mentor: TBD

uutils/procps reimplements the process and system monitoring tools. Complete the core utilities and reach production readiness with full GNU compatibility. Scope can be focused on one of the groups below.

  • Process management & info: ps, pgrep, pidwait, pkill, skill, snice
  • System monitoring & statistics: top, vmstat, tload, w, watch
  • Memory & resource analysis: pmap, slabtop, free, uptime
  • Robust /proc parsing across kernels; run the upstream procps tests

Skills: Rust, Linux /proc filesystem, process management and monitoring.
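
One concrete example of why /proc parsing needs care: in `/proc/[pid]/stat` the command name is wrapped in parentheses and may itself contain spaces and parentheses, so splitting the whole line on whitespace is not safe. The sketch below locates the first `(` and the last `)` instead. The `Stat` struct and `parse_stat` function are hypothetical and only extract the first four fields.

```rust
#[derive(Debug, PartialEq)]
struct Stat {
    pid: i32,
    comm: String,
    state: char,
    ppid: i32,
}

/// Parse the leading fields of a /proc/[pid]/stat line.
/// Returns None if the line does not have the expected shape.
fn parse_stat(line: &str) -> Option<Stat> {
    // comm can contain spaces and ')' characters, so use the LAST ')'.
    let open = line.find('(')?;
    let close = line.rfind(')')?;
    if close < open {
        return None;
    }
    let pid = line[..open].trim().parse().ok()?;
    let comm = line[open + 1..close].to_string();
    let mut rest = line[close + 1..].split_ascii_whitespace();
    let state = rest.next()?.chars().next()?;
    let ppid = rest.next()?.parse().ok()?;
    Some(Stat { pid, comm, state, ppid })
}

fn main() {
    let line = "1234 (my (weird) proc) S 1 1234 1234 0 -1";
    println!("{:?}", parse_stat(line));
}
```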

Complete util-linux implementation

Medium ~350h Mentor: TBD

uutils/util-linux reimplements essential system utilities. Complete the most commonly used tools and reach production-ready status with full compatibility. Scope can be focused on one of the groups below.

  • Essential system utilities: dmesg, lscpu, lsipc, lslocks, lsmem, lsns, mount, umount
  • Process & resource management: chrt, ionice, kill, renice, prlimit, taskset, runuser
  • User & session management: su, agetty, last, lslogins, mesg, setsid, setterm
  • Run the upstream util-linux tests; check man-page compatibility

Skills: Rust, Linux system calls and kernel interfaces, system administration.

Complete bsdutils implementation

Medium ~175h Mentor: TBD

uutils/bsdutils reimplements BSD-origin utilities found on Linux. Complete the core tools with compatibility across BSD and GNU/Linux variants.

  • Complete logger, column, hexdump, look and friends
  • Terminal session recording: script, scriptlive, scriptreplay
  • Handle cross-platform differences and portability
  • Set up test suites for both BSD and GNU variants

Skills: Rust, BSD and Linux environments, terminal emulation, cross-platform development.

Localization

Hard Size: TBD Mentor: TBD

Support localization for formatting, quoting and sorting in utilities like date, ls and sort. The core question is how to deal with locale data: the all-Rust icu4x library (possibly with a custom localedef-like command) versus a wrapper around the C icu library.

Skills: Rust, Unicode/locale handling.

Code refactoring for procps, util-linux & bsdutils

Medium ~175h Mentor: Sylvestre

Refactor the Rust versions of procps, util-linux and bsdutils to reduce duplication, particularly around uudoc, the test framework, and single/multicall binary support.

  • Eliminate duplicated code across the three projects
  • Unify the documentation and test scaffolding
  • Improve maintainability and shared infrastructure

Skills: Rust, Linux utilities, code optimization and refactoring.

A multicall binary and core library for findutils

Medium ~175h Mentor: TBD

findutils currently consists of a few unconnected binaries. Build a multicall binary (like coreutils) and a library of shared functions (like uucore).

  • Design a unified multicall entry point
  • Extract shared functionality into a core library
  • Consider sharing code between coreutils and findutils

Skills: Rust, library and CLI design.
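
A multicall entry point usually decides which utility to run from the name it was invoked under, falling back to the first argument. The sketch below shows that dispatch logic only. The `UTILS` list, the `dispatch` function and the `findutils` binary name are assumptions for illustration, not an existing API.

```rust
use std::path::Path;

/// Hypothetical list of utilities bundled into the multicall binary.
const UTILS: &[&str] = &["find", "xargs"];

/// Decide which utility to run and which arguments to hand to it.
/// First try the invoked name (e.g. a `find` symlink), then fall back
/// to the first argument (e.g. `findutils find ...`).
fn dispatch(args: &[String]) -> Option<(&str, &[String])> {
    let invoked = Path::new(args.first()?).file_name()?.to_str()?;
    if UTILS.contains(&invoked) {
        return Some((invoked, &args[1..]));
    }
    let util = args.get(1)?.as_str();
    if UTILS.contains(&util) {
        Some((util, &args[2..]))
    } else {
        None
    }
}

fn main() {
    let args: Vec<String> = std::env::args().collect();
    match dispatch(&args) {
        Some((util, rest)) => println!("would run {util} with {rest:?}"),
        None => eprintln!("usage: findutils <{}> [args...]", UTILS.join("|")),
    }
}
```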

GNU test execution for procps, util-linux, diffutils & bsdutils

Medium ~175h Mentor: Sylvestre

Integrate the upstream test suites against the Rust versions of procps, util-linux, diffutils and bsdutils - crucial for proving drop-in compatibility. We already do this for coreutils via GitHub Actions, a build script and a run script.

  • Adapt the coreutils CI approach to each project
  • Wire the test suites into GitHub Actions
  • Track and report compatibility over time

Skills: GitHub Actions, Rust, GNU testing methodologies, Linux utilities.

Symbolic/fuzz testing and formal verification of tool grammars

Mixed Size: Mixed Mentor: TBD (informally @chadbrewbaker)

Inspired by lightweight formal methods at AWS; most KLEE scaffolding was done for KLEE 2021. Start with wc, formalize its command-line grammar, run it under AFL++ and KLEE, then generalize.

  • Add proofs of resource use and correctness, especially around syscalls and memory
  • Unify fuzzer and KLEE seeds so they help each other find paths
  • Formalize the wc UTF-8 inner loop into a SIMD-friendly automaton / monoid
  • Automate detection of accidentally quadratic behavior

Skills: Rust, KLEE, AFL++, SMT solvers (Z3, CVC5), TLA+/Alloy, grammar testing.
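
The monoid idea from the list above can be sketched for word counting: summarize each chunk independently, then combine summaries associatively, which is what makes the loop chunkable and SIMD-friendly. This sketch is a simplification that treats only ASCII whitespace as separators and ignores UTF-8; `Summary`, `summarize` and `combine` are names invented here.

```rust
/// Summary of a chunk of bytes; `Empty` is the identity element.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Summary {
    Empty,
    Text {
        words: usize,
        starts_in_word: bool,
        ends_in_word: bool,
    },
}

/// Summarize one chunk, treating ASCII whitespace as word separators.
fn summarize(bytes: &[u8]) -> Summary {
    if bytes.is_empty() {
        return Summary::Empty;
    }
    let mut words = 0;
    let mut in_word = false;
    for b in bytes {
        let is_word_byte = !b.is_ascii_whitespace();
        if is_word_byte && !in_word {
            words += 1; // a new word starts here
        }
        in_word = is_word_byte;
    }
    Summary::Text {
        words,
        starts_in_word: !bytes[0].is_ascii_whitespace(),
        ends_in_word: in_word,
    }
}

/// Associative combination of the summaries of two adjacent chunks.
fn combine(left: Summary, right: Summary) -> Summary {
    match (left, right) {
        (Summary::Empty, other) | (other, Summary::Empty) => other,
        (
            Summary::Text { words: lw, starts_in_word, ends_in_word: l_end },
            Summary::Text { words: rw, starts_in_word: r_start, ends_in_word },
        ) => {
            // A word split across the chunk boundary was counted twice.
            let merged = usize::from(l_end && r_start);
            Summary::Text { words: lw + rw - merged, starts_in_word, ends_in_word }
        }
    }
}

fn main() {
    let text = b"hello wor ld  x\n";
    let (a, b) = text.split_at(7);
    println!("{:?}", combine(summarize(a), summarize(b)));
}
```

Because `combine` is associative with `Empty` as identity, the property "any split gives the same result as the whole" is a natural target for both fuzzing and symbolic execution.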

Official Redox support

Medium ~175h Mentor: TBD

We want to support the Redox operating system but are not actively testing against it. Since the last round of fixes in #2550, regressions have likely crept in.

  • Set up Redox in CI
  • Fix the issues that arise and port missing features

Skills: Rust, cross-platform/OS development.