String Utilities
This module contains utilities for manipulating strings, producing
readable output, and handling CLI input and output.
See morimekta.net/utils for procedures on releases.
Getting Started
To add to maven:
<dependency>
<groupId>net.morimekta.utils</groupId>
<artifactId>strings</artifactId>
<version>4.6.2</version>
</dependency>
To add to gradle:
implementation 'net.morimekta.utils:strings:4.6.2'
Core Utilities
ConsoleUtil: Contains a few methods related to the visibility of characters on the console.
EscapeUtil: Escape and unescape strings using the same escape sequences as string literals in Java code.
NamingUtil: Reformat names using naming rules.
ReaderUtil: Utilities to read or skip content from a Reader.
StringUtil: Get properties of strings and modify strings using extra utilities from the library. Also has a utility for consistent string formatting of any object.
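To illustrate what "the same escape sequences as Java code" means in practice, here is a minimal, self-contained sketch of such an escaper. It does not use or reproduce the library's EscapeUtil; the class name JavaEscapeSketch and its escape method are hypothetical, and the choice to pass non-ASCII characters through unchanged is an assumption.

```java
// Sketch only: a hand-rolled escaper, NOT the library's EscapeUtil.
final class JavaEscapeSketch {
    private JavaEscapeSketch() {}

    /** Escape text so it could be pasted between the quotes of a Java string literal. */
    static String escape(CharSequence text) {
        StringBuilder out = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\b': out.append("\\b"); break;
                case '\t': out.append("\\t"); break;
                case '\n': out.append("\\n"); break;
                case '\f': out.append("\\f"); break;
                case '\r': out.append("\\r"); break;
                case '"':  out.append("\\\""); break;
                case '\'': out.append("\\'"); break;
                case '\\': out.append("\\\\"); break;
                default:
                    if (c < 0x20 || c == 0x7f) {
                        // Remaining control characters become a 4-digit hex unicode escape.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        // Assumption: printable and non-ASCII characters pass through as-is.
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }
}
```

Calling JavaEscapeSketch.escape on a string containing a tab and a newline yields the two-character sequences for each, so the result can be printed on a single console line.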
And interfaces
Displayable: Simple interface with a displayString() method, and utility methods to make readable strings of standard Java utility classes.
Stringable: Simple interface with an asString() method, and utility methods to make to-string-like strings from standard Java types.
Character
Char, and implementations Color, Control, Unicode: Wrappers around single keystrokes, control sequences, terminal colors and Unicode chars. When handling terminal input, each can represent a single keystroke; when updating a terminal, they can also control visible colors, move the cursor around, etc.
CharReader: A reader that reads Char objects from an input stream.
CharStream (and CharSplitterator): Makes a stream of chars from a string or input stream.
CharUtil: Utilities for making chars from meaningful input, or bytes from a list of chars.
CharSlice: Make an immutable sliced view of a char sequence. Operates as a CharSequence but, unlike a string, never copies the underlying data on view operations.
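The idea behind a non-copying slice can be sketched with plain Java as follows. This is an illustration of the technique only, not the library's CharSlice; the class SliceSketch and its constructor are hypothetical. Note that the view is only as immutable as the sequence it wraps, so the sketch assumes an immutable base such as a String.

```java
// Sketch only: a non-copying view over a CharSequence, NOT the library's CharSlice.
final class SliceSketch implements CharSequence {
    private final CharSequence base; // shared, never copied
    private final int offset;        // start of the view within base
    private final int length;        // number of chars visible through the view

    SliceSketch(CharSequence base, int offset, int length) {
        if (offset < 0 || length < 0 || offset > base.length() - length) {
            throw new IndexOutOfBoundsException("offset=" + offset + ", length=" + length);
        }
        this.base = base;
        this.offset = offset;
        this.length = length;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index=" + index);
        }
        return base.charAt(offset + index);
    }

    @Override
    public SliceSketch subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new IndexOutOfBoundsException("start=" + start + ", end=" + end);
        }
        // A sub-view only adjusts offset and length; the base is shared.
        return new SliceSketch(base, offset + start, end - start);
    }

    @Override
    public String toString() {
        // The only place where characters are actually copied.
        return base.subSequence(offset, offset + length).toString();
    }
}
```

For example, new SliceSketch("hello world", 6, 5) views the word "world", and taking subSequence(1, 3) of that view gives "or" without allocating a new character buffer until toString() is called.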
Diff
DiffStringUtil: Utilities used when handling diffs: splitting by line into CharSlice, and prefix, suffix and overlap comparisons of char sequences.
PatchUtil: Get a line-by-line diff of two strings, and make patch strings from a list of changes.
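The prefix, suffix and overlap comparisons mentioned above are the usual building blocks of diff algorithms. A minimal sketch of what they compute is shown below. It is not the library's DiffStringUtil; the class OverlapSketch and its method names are hypothetical, and the overlap check uses a simple quadratic scan for clarity rather than speed.

```java
// Sketch only: common prefix / suffix / overlap lengths, NOT the library's DiffStringUtil.
final class OverlapSketch {
    private OverlapSketch() {}

    /** Number of leading chars that a and b have in common. */
    static int commonPrefix(CharSequence a, CharSequence b) {
        int max = Math.min(a.length(), b.length());
        int n = 0;
        while (n < max && a.charAt(n) == b.charAt(n)) {
            n++;
        }
        return n;
    }

    /** Number of trailing chars that a and b have in common. */
    static int commonSuffix(CharSequence a, CharSequence b) {
        int max = Math.min(a.length(), b.length());
        int n = 0;
        while (n < max && a.charAt(a.length() - 1 - n) == b.charAt(b.length() - 1 - n)) {
            n++;
        }
        return n;
    }

    /** Length of the longest suffix of a that is also a prefix of b. */
    static int commonOverlap(CharSequence a, CharSequence b) {
        int max = Math.min(a.length(), b.length());
        for (int len = max; len > 0; len--) {
            int start = a.length() - len;
            boolean match = true;
            for (int i = 0; i < len && match; i++) {
                match = a.charAt(start + i) == b.charAt(i);
            }
            if (match) {
                return len;
            }
        }
        return 0;
    }
}
```

A diff typically strips the common prefix and suffix first, so that the more expensive comparison only runs on the part of the text that actually changed.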
Encoding
GSMCharset: Charset used in GSM (mobile) encoding. Uses only 7 bits of each byte when encoding, so the result can be bit-packed afterwards.
T61Charset: Charset used in TELEX (old terminal control exchange format). Is mostly a subset of ASCII plus its own extended characters.
TBCDCharset: Charset used in SS7 and MAP messages (core telco systems used by GSM systems since 1986). Encodes number sequences plus * and # and the letters a-c. Has a special variant for handling an odd number of digits.
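As an illustration of how TBCD packs digits, the sketch below stores two symbols per byte, low nibble first, and pads an odd-length input with the filler nibble 0xF. It follows the commonly documented TBCD-String layout (values 10 to 14 for *, #, a, b, c); whether the library's TBCDCharset uses exactly this mapping and filler is an assumption, and the class TbcdSketch is hypothetical.

```java
// Sketch only: TBCD packing as commonly documented, NOT the library's TBCDCharset.
final class TbcdSketch {
    // The index of each symbol is its 4-bit value; 0xF is reserved as filler.
    private static final String SYMBOLS = "0123456789*#abc";

    private TbcdSketch() {}

    static byte[] encode(String digits) {
        byte[] out = new byte[(digits.length() + 1) / 2];
        for (int i = 0; i < digits.length(); i++) {
            int value = SYMBOLS.indexOf(digits.charAt(i));
            if (value < 0) {
                throw new IllegalArgumentException("Not a TBCD symbol: " + digits.charAt(i));
            }
            if (i % 2 == 0) {
                // First symbol of a pair: low nibble, with filler in the high nibble for now.
                out[i / 2] = (byte) (0xF0 | value);
            } else {
                // Second symbol of a pair: replaces the filler in the high nibble.
                out[i / 2] = (byte) ((out[i / 2] & 0x0F) | (value << 4));
            }
        }
        return out;
    }

    static String decode(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            int low = b & 0x0F;
            int high = (b >>> 4) & 0x0F;
            if (low == 0x0F) {
                throw new IllegalArgumentException("Filler in low nibble");
            }
            sb.append(SYMBOLS.charAt(low));
            if (high == 0x0F) {
                break; // filler marks the end of an odd-length sequence
            }
            sb.append(SYMBOLS.charAt(high));
        }
        return sb.toString();
    }
}
```

With this layout, the five digits 12345 become the three bytes 0x21, 0x43, 0xF5, and decoding those bytes gives back the original digit string.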
IO
LineBufferedReader: Read from an underlying reader, buffering only one line at a time. Never reads the next line until required.
Utf8Stream(Reader|Writer): Read or write characters to a stream using proper UTF-8 encoding, meaning UTF-16 / UCS-2 combined characters (surrogate pairs) are properly re-encoded to UTF-8 sequences. Also does no caching except what is needed to handle the extended Unicode chars.
IndentedPrintWriter: Write while remembering and applying the ongoing indent. Indents can be stacked, so simpler code can generate a properly indented string.
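To show what "properly re-encoded" means for combined characters, the following sketch encodes a char sequence to UTF-8 by hand, joining each surrogate pair into one code point before writing it as a single 4-byte sequence. It is an illustration of the encoding rule only, not the library's Utf8StreamWriter; the class Utf8Sketch is hypothetical, and replacing unpaired surrogates with U+FFFD is an assumption of this sketch.

```java
import java.io.ByteArrayOutputStream;

// Sketch only: manual UTF-8 encoding with surrogate pairing, NOT the library's Utf8StreamWriter.
final class Utf8Sketch {
    private Utf8Sketch() {}

    static byte[] encode(CharSequence text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            int cp = c;
            if (Character.isHighSurrogate(c)
                    && i + 1 < text.length()
                    && Character.isLowSurrogate(text.charAt(i + 1))) {
                // Combine the UTF-16 pair into one code point and consume both chars.
                cp = Character.toCodePoint(c, text.charAt(++i));
            } else if (Character.isSurrogate(c)) {
                cp = 0xFFFD; // assumption: unpaired surrogates become the replacement char
            }
            if (cp < 0x80) {
                out.write(cp);
            } else if (cp < 0x800) {
                out.write(0xC0 | (cp >> 6));
                out.write(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {
                out.write(0xE0 | (cp >> 12));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else {
                out.write(0xF0 | (cp >> 18));
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            }
        }
        return out.toByteArray();
    }
}
```

Encoding each half of a surrogate pair separately would instead produce two 3-byte sequences, which is not valid UTF-8; only one char of look-ahead is needed to avoid that.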