String Utilities

This module contains utilities for manipulating strings, producing readable output, and handling CLI input and output. See for procedures on releases.

Core Utilities

  • ConsoleUtil: Contains a few methods related to the visibility of characters on the console.
  • EscapeUtil: Escape and unescape strings using the same escape sequences as string literals in Java code.
  • NamingUtil: Reformat names using naming rules.
  • ReaderUtil: Utilities to read or skip content from Reader.
  • StringUtil: Get properties of strings and modify strings using extra utilities from the library. Also provides a utility for consistent string formatting of any object.
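
As an illustration of the Java-style escaping described for EscapeUtil, here is a minimal sketch. The class `JavaEscapeSketch` and its method names are hypothetical and are not the library's API; it only assumes that "the same escape sequences as Java" means the usual backslash escapes plus four-digit unicode escapes for other control characters.

```java
// Hypothetical sketch of Java-style escaping; NOT the library's EscapeUtil.
final class JavaEscapeSketch {
    private JavaEscapeSketch() {}

    /** Escapes text so it could be pasted inside a Java string literal. */
    static String escape(CharSequence in) {
        StringBuilder out = new StringBuilder(in.length() + 8);
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            switch (c) {
                case '\b': out.append("\\b"); break;
                case '\t': out.append("\\t"); break;
                case '\n': out.append("\\n"); break;
                case '\f': out.append("\\f"); break;
                case '\r': out.append("\\r"); break;
                case '"':  out.append("\\\""); break;
                case '\'': out.append("\\'"); break;
                case '\\': out.append("\\\\"); break;
                default:
                    if (c < 0x20 || c == 0x7f) {
                        // Remaining control characters become a four-digit unicode escape.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    /** Reverses escape(); throws IllegalArgumentException on malformed input. */
    static String unescape(CharSequence in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (++i >= in.length()) {
                throw new IllegalArgumentException("Dangling backslash at end of input");
            }
            char e = in.charAt(i);
            switch (e) {
                case 'b': out.append('\b'); break;
                case 't': out.append('\t'); break;
                case 'n': out.append('\n'); break;
                case 'f': out.append('\f'); break;
                case 'r': out.append('\r'); break;
                case '"': case '\'': case '\\': out.append(e); break;
                case 'u':
                    if (i + 4 >= in.length()) {
                        throw new IllegalArgumentException("Truncated unicode escape");
                    }
                    // The four characters after the 'u' are the hex code unit.
                    String hex = in.subSequence(i + 1, i + 5).toString();
                    out.append((char) Integer.parseInt(hex, 16));
                    i += 4;
                    break;
                default:
                    throw new IllegalArgumentException("Unknown escape: " + e);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "tab\there, \"quoted\"\n";
        String escaped = escape(raw);
        System.out.println(escaped);
        System.out.println(unescape(escaped).equals(raw));
    }
}
```

The sketch round-trips: `unescape(escape(s))` yields `s` for any input, since every character is either passed through or mapped to exactly one escape. Octal escapes, which Java also accepts, are left out for brevity.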

Interfaces

  • Displayable: Simple interface with a displayString() method, and utility methods to make readable strings of standard Java utility classes.
  • Stringable: Simple interface with an asString() method, and utility methods to make to-string like strings from standard Java types.

Chars

  • Char, and implementations Color, Control, Unicode: These are wrappers around single keystrokes, control sequences, terminal colors and Unicode characters. When handling terminal input, each can represent a single keystroke; when updating a terminal, they can also control visible colors, move the cursor around, etc.
  • CharReader: A reader that can read Char objects from an input stream.
  • CharStream (and CharSplitterator): Makes a stream of chars from a string or input stream.
  • CharUtil: Utilities for making chars from meaningful input, or bytes from a list of chars.
  • CharSlice: Make an immutable sliced view of a char sequence. Operates as a CharSequence, but unlike a string, will never copy the underlying data on view operations.
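
The idea behind a sliced view like CharSlice can be sketched as follows. `SliceSketch` is a hypothetical class written for this illustration, not the library's implementation: it shares one backing array between all views, so `subSequence` only adjusts an offset and a length.

```java
// Hypothetical sketch of an immutable sliced view; NOT the library's CharSlice.
final class SliceSketch implements CharSequence {
    private final char[] data; // shared between all views, never modified
    private final int off;
    private final int len;

    /** Simplification: copies the string once here; views never copy again. */
    SliceSketch(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SliceSketch(char[] data, int off, int len) {
        this.data = data;
        this.off = off;
        this.len = len;
    }

    @Override
    public int length() {
        return len;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= len) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + len);
        }
        return data[off + index];
    }

    /** Returns a new view on the same array; no characters are copied. */
    @Override
    public SliceSketch subSequence(int start, int end) {
        if (start < 0 || end > len || start > end) {
            throw new IndexOutOfBoundsException("range " + start + ".." + end + ", length " + len);
        }
        return new SliceSketch(data, off + start, end - start);
    }

    /** Materializing a String is the only operation that copies. */
    @Override
    public String toString() {
        return new String(data, off, len);
    }

    public static void main(String[] args) {
        SliceSketch all = new SliceSketch("hello world");
        SliceSketch word = all.subSequence(6, 11);
        System.out.println(word);                   // prints "world"
        System.out.println(word.subSequence(1, 3)); // prints "or"
    }
}
```

Because the backing array is private and never written after construction, views can be handed out freely without defensive copies.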

Diffs

  • DiffStringUtil: Utilities used when handling diffs: splitting by line into CharSlice instances, and prefix, suffix and overlap comparisons of char sequences.
  • PatchUtil: Get a line-by-line diff of two strings, and make patch strings from a list of changes.
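
The prefix, suffix and overlap comparisons mentioned for DiffStringUtil can be illustrated with a small sketch. `DiffSketch` and its method names are hypothetical; here "overlap" is assumed to mean the longest suffix of the first sequence that is also a prefix of the second.

```java
// Hypothetical sketch of diff helper comparisons; NOT the library's DiffStringUtil.
final class DiffSketch {
    private DiffSketch() {}

    /** Number of leading characters the two sequences share. */
    static int commonPrefix(CharSequence a, CharSequence b) {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        return n;
    }

    /** Number of trailing characters the two sequences share. */
    static int commonSuffix(CharSequence a, CharSequence b) {
        int al = a.length();
        int bl = b.length();
        int n = Math.min(al, bl);
        for (int i = 1; i <= n; i++) {
            if (a.charAt(al - i) != b.charAt(bl - i)) {
                return i - 1;
            }
        }
        return n;
    }

    /** Length of the longest suffix of a that is also a prefix of b. */
    static int commonOverlap(CharSequence a, CharSequence b) {
        int n = Math.min(a.length(), b.length());
        // Simple quadratic scan from the longest candidate down; fine for a sketch.
        for (int k = n; k > 0; k--) {
            int start = a.length() - k;
            boolean match = true;
            for (int i = 0; i < k && match; i++) {
                match = a.charAt(start + i) == b.charAt(i);
            }
            if (match) {
                return k;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(commonPrefix("1234abcdef", "1234xyz")); // prints 4
        System.out.println(commonSuffix("abcdef1234", "xyz1234")); // prints 4
        System.out.println(commonOverlap("123456xxx", "xxxabcd")); // prints 3
    }
}
```

All three methods take `CharSequence`, so they work equally on strings and on sliced views without copying.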

Charsets

  • GSMCharset: Charset used in GSM (mobile) encoding. Uses only 7 bits of each byte when encoding, so the result can be bit-packed afterwards.
  • T61Charset: Charset used in TELEX (old terminal control exchange format). Is mostly a subset of ASCII plus its own extended characters.
  • TBCDCharset: Charset used in SS7 and MAP messages (core telco systems used by GSM systems since 1986). Encodes number sequences plus * and # and the letters a-c. Has special variant for handling odd number of digits.
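
To make the TBCD description more concrete, here is a minimal sketch of one common TBCD layout. It is an assumption, not taken from TBCDCharset: two symbols per byte with the first symbol in the low nibble, `*`, `#`, `a`, `b`, `c` mapped to nibbles 0xA through 0xE, and 0xF used as filler in the last high nibble when the number of symbols is odd. The class `TbcdSketch` is hypothetical, and the library's "special variant" for odd lengths may work differently.

```java
// Hypothetical sketch of TBCD packing; NOT the library's TBCDCharset.
final class TbcdSketch {
    // Index in this string is the nibble value (assumed mapping, 0xF is the filler).
    private static final String SYMBOLS = "0123456789*#abc";
    private static final int FILLER = 0xF;

    private TbcdSketch() {}

    /** Packs two symbols per byte, first symbol in the low nibble. */
    static byte[] encode(String symbols) {
        byte[] out = new byte[(symbols.length() + 1) / 2];
        for (int i = 0; i < symbols.length(); i++) {
            int nibble = SYMBOLS.indexOf(symbols.charAt(i));
            if (nibble < 0) {
                throw new IllegalArgumentException("Not a TBCD symbol: " + symbols.charAt(i));
            }
            if (i % 2 == 0) {
                // Start the byte with a filler high nibble; overwritten if a symbol follows.
                out[i / 2] = (byte) ((FILLER << 4) | nibble);
            } else {
                out[i / 2] = (byte) ((out[i / 2] & 0x0F) | (nibble << 4));
            }
        }
        return out;
    }

    /** Unpacks bytes written by encode(); a filler is only allowed in the last high nibble. */
    static String decode(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (int i = 0; i < bytes.length; i++) {
            int lo = bytes[i] & 0x0F;
            int hi = (bytes[i] >> 4) & 0x0F;
            if (lo == FILLER) {
                throw new IllegalArgumentException("Filler in low nibble at byte " + i);
            }
            sb.append(SYMBOLS.charAt(lo));
            if (hi != FILLER) {
                sb.append(SYMBOLS.charAt(hi));
            } else if (i != bytes.length - 1) {
                throw new IllegalArgumentException("Filler before end of input at byte " + i);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] packed = encode("12345");
        for (byte b : packed) {
            System.out.printf("%02x ", b & 0xFF); // prints "21 43 f5 "
        }
        System.out.println();
        System.out.println(decode(packed)); // prints "12345"
    }
}
```

With this layout an odd-length number costs no extra byte: the filler simply occupies the otherwise unused half of the last byte.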

IO

  • LineBufferedReader: Reads from a wrapped reader, but buffers only one line at a time. Will never read the next line until required.
  • Utf8Stream(Reader|Writer): Read or write characters to a stream using proper UTF-8 encoding, meaning UTF-16 / UCS-2 combined characters (surrogate pairs) are properly re-encoded to UTF-8 sequences. Also does no caching except what is needed to handle the extended Unicode characters.
  • IndentedPrintWriter: Write while remembering and applying ongoing indent. Can stack indents so as to generate a properly indented string based on simpler code.
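
The stacked-indent idea behind IndentedPrintWriter can be sketched like this. `IndentSketch` and its `begin`, `end` and `line` methods are hypothetical names chosen for the illustration; the real class is a writer and its API may look quite different.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of stacked indentation; NOT the library's IndentedPrintWriter.
final class IndentSketch {
    private final StringBuilder out = new StringBuilder();
    private final Deque<String> saved = new ArrayDeque<>(); // indents to restore on end()
    private String current = "";

    /** Pushes the current indent and extends it with the given string. */
    IndentSketch begin(String indent) {
        saved.push(current);
        current = current + indent;
        return this;
    }

    /** Restores the indent that was active before the matching begin(). */
    IndentSketch end() {
        if (saved.isEmpty()) {
            throw new IllegalStateException("end() without matching begin()");
        }
        current = saved.pop();
        return this;
    }

    /** Writes one line, prefixed with the indent that is currently active. */
    IndentSketch line(String text) {
        out.append(current).append(text).append('\n');
        return this;
    }

    @Override
    public String toString() {
        return out.toString();
    }

    public static void main(String[] args) {
        IndentSketch w = new IndentSketch();
        w.line("class A {")
         .begin("    ")
         .line("void run() {")
         .begin("    ")
         .line("doWork();")
         .end()
         .line("}")
         .end()
         .line("}");
        System.out.print(w);
    }
}
```

The calling code never concatenates indentation itself; it only marks where a nested block begins and ends, which keeps generator code simple even for deep nesting.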