String Utilities
This module contains utilities for manipulating strings and producing
readable output, and for handling CLI input and output.
See morimekta.net/utils for procedures on releases.
Getting Started
To add to Maven:
<dependency>
    <groupId>net.morimekta.utils</groupId>
    <artifactId>strings</artifactId>
    <version>4.5.1</version>
</dependency>
To add to Gradle:
implementation 'net.morimekta.utils:strings:4.5.1'
Core Utilities
ConsoleUtil
: Contains a few methods related to the visibility of characters on the console.
EscapeUtil
: Escape and unescape strings using the same escape sequences as string literals in Java code.
NamingUtil
: Reformat names using naming rules.
ReaderUtil
: Utilities to read or skip content from a Reader.
StringUtil
: Get properties of strings and modify strings using extra utilities from the library. Also has a utility for consistent string formatting of any object.
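To illustrate what "the same escape sequences as string literals in Java code" means for EscapeUtil, here is a minimal sketch using only the JDK. The class EscapeSketch and its methods are hypothetical names for illustration and are not the library's API; the sketch covers only a handful of common escapes and assumes well-formed input when unescaping.

```java
// Hypothetical illustration, not part of net.morimekta.utils.
final class EscapeSketch {
    /** Escape a string as it could appear inside a Java string literal. */
    static String escape(String in) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            switch (c) {
                case '\n': out.append('\\').append('n'); break;
                case '\t': out.append('\\').append('t'); break;
                case '\r': out.append('\\').append('r'); break;
                case '"':  out.append('\\').append('"'); break;
                case '\\': out.append('\\').append('\\'); break;
                default:
                    if (c < 0x20 || c == 0x7f) {
                        // Other control characters become a 4-digit unicode escape.
                        out.append('\\').append('u').append(String.format("%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    /** Reverse of escape(); assumes the input is well-formed. */
    static String unescape(String in) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c != '\\' || i + 1 >= in.length()) {
                out.append(c);
                continue;
            }
            char e = in.charAt(++i);
            switch (e) {
                case 'n':  out.append('\n'); break;
                case 't':  out.append('\t'); break;
                case 'r':  out.append('\r'); break;
                case '"':  out.append('"'); break;
                case '\\': out.append('\\'); break;
                case 'u':
                    // Four hex digits follow the 'u'.
                    out.append((char) Integer.parseInt(in.substring(i + 1, i + 5), 16));
                    i += 4;
                    break;
                default:
                    // Unknown escape: keep it unchanged.
                    out.append('\\').append(e);
            }
        }
        return out.toString();
    }
}
```

Escaping followed by unescaping should return the original string, which is a useful property to test for any escape utility.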
And interfaces
Displayable
: Simple interface with a displayString() method, and utility methods to make readable strings of standard Java utility classes.
Stringable
: Simple interface with an asString() method, and utility methods to make to-string like strings from standard Java types.
Character
Char, and implementations Color, Control, Unicode
: These are wrappers around single keystrokes, control sequences, terminal colors and unicode chars. When handling terminal input, each can represent one keystroke; when updating a terminal, they can also control visible colors, move the cursor around etc.
CharReader
: A reader that can read Char objects from an input stream.
CharStream (and CharSplitterator)
: Makes a stream of chars from a string or input stream.
CharUtil
: Utilities for making chars from meaningful input, or bytes from a list of chars.
CharSlice
: Make an immutable sliced view of a char sequence. Operates as a CharSequence, but unlike a string, will never copy the underlying data on view operations.
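The idea behind CharSlice, a CharSequence view that shares its backing data instead of copying it, can be sketched as follows. SliceSketch is a hypothetical class written only against the JDK; it is not the library's implementation, and the real CharSlice may differ in constructors and behavior.

```java
// Hypothetical illustration of a non-copying CharSequence view.
final class SliceSketch implements CharSequence {
    private final char[] data;   // shared backing array, never modified
    private final int offset;    // start of this view within data
    private final int length;    // number of chars visible in this view

    SliceSketch(String source) {
        // One copy when leaving String; every later view shares this array.
        this(source.toCharArray(), 0, source.length());
    }

    private SliceSketch(char[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[offset + index];
    }

    @Override
    public SliceSketch subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new IndexOutOfBoundsException(start + ".." + end);
        }
        // Only offset and length change; the char data is not copied.
        return new SliceSketch(data, offset + start, end - start);
    }

    @Override
    public String toString() {
        return new String(data, offset, length);
    }
}
```

Taking a sub-sequence of a sub-sequence only adjusts two integers, which is what makes such views cheap when splitting large inputs into many lines or tokens.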
Diff
DiffStringUtil
: Utilities used when handling diffs: splitting by line into CharSlice, and prefix, suffix and overlap comparisons of char sequences.
PatchUtil
: Get a line-by-line diff of two strings, and make patch strings from a list of changes.
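The prefix, suffix and overlap comparisons mentioned for DiffStringUtil are common building blocks in diff algorithms. The sketch below shows one plain-JDK way to compute them; DiffSketch and its method names are hypothetical and should not be read as the library's signatures.

```java
// Hypothetical illustration of prefix, suffix and overlap comparisons.
final class DiffSketch {
    /** Number of leading chars that a and b have in common. */
    static int commonPrefix(CharSequence a, CharSequence b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }

    /** Number of trailing chars that a and b have in common. */
    static int commonSuffix(CharSequence a, CharSequence b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(a.length() - 1 - i) == b.charAt(b.length() - 1 - i)) {
            i++;
        }
        return i;
    }

    /** Length of the longest suffix of a that is also a prefix of b. */
    static int commonOverlap(CharSequence a, CharSequence b) {
        int max = Math.min(a.length(), b.length());
        for (int len = max; len > 0; len--) {
            boolean match = true;
            for (int i = 0; i < len; i++) {
                if (a.charAt(a.length() - len + i) != b.charAt(i)) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return len;
            }
        }
        return 0;
    }
}
```

Trimming the common prefix and suffix before running a full diff reduces the amount of text the more expensive comparison has to look at.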
Encoding
GSMCharset
: Charset used in GSM (mobile) encoding. Uses only 7 bits of each encoded byte, so the output can be bit-packed after encoding.
T61Charset
: Charset used in TELEX (old terminal control exchange format). Is mostly a subset of ASCII plus its own extended characters.
TBCDCharset
: Charset used in SS7 and MAP messages (core telco systems used by GSM systems since 1986). Encodes number sequences plus * and # and the letters a-c. Has a special variant for handling an odd number of digits.
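As a rough illustration of what TBCD encoding looks like on the wire, the sketch below packs two symbols per byte. It assumes the convention commonly used for TBCD strings in MAP: the first symbol of each pair goes in the low nibble, *, #, a, b, c map to the values 10 to 14, and an odd-length input is padded with the filler nibble 0xF. TbcdSketch is a hypothetical class, and the library's TBCDCharset may handle these details differently.

```java
// Hypothetical illustration of TBCD packing; conventions are assumptions.
final class TbcdSketch {
    // The index of each symbol is its 4-bit value; 0xF is reserved as filler.
    private static final String SYMBOLS = "0123456789*#abc";

    static byte[] encode(String digits) {
        byte[] out = new byte[(digits.length() + 1) / 2];
        for (int i = 0; i < digits.length(); i++) {
            int nibble = SYMBOLS.indexOf(digits.charAt(i));
            if (nibble < 0) {
                throw new IllegalArgumentException("not a TBCD symbol: " + digits.charAt(i));
            }
            if (i % 2 == 0) {
                // First of a pair: low nibble, with the high nibble pre-set to filler.
                out[i / 2] = (byte) (0xF0 | nibble);
            } else {
                // Second of a pair: replace the filler in the high nibble.
                out[i / 2] = (byte) ((out[i / 2] & 0x0F) | (nibble << 4));
            }
        }
        return out;
    }

    static String decode(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int low = b & 0x0F;
            int high = (b >> 4) & 0x0F;
            if (low == 0xF) {
                break;
            }
            sb.append(SYMBOLS.charAt(low));
            if (high == 0xF) {
                break; // filler marks an odd number of symbols
            }
            sb.append(SYMBOLS.charAt(high));
        }
        return sb.toString();
    }
}
```

Under these assumptions "1234" becomes the bytes 0x21 0x43, and "123" becomes 0x21 0xF3, where the trailing 0xF nibble marks the odd length.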
IO
LineBufferedReader
: Read from a sub-reader, but only buffering one line at a time. Will never read the next line until required.
Utf8Stream(Reader|Writer)
: Read or write characters to a stream using proper UTF-8 encoding, meaning UTF-16 / UCS-2 combined characters (surrogate pairs) are properly re-encoded to UTF-8 sequences. Also does no caching except for handling the extended unicode chars.
IndentedPrintWriter
: Write while remembering and applying the ongoing indent. Indents can be stacked, so that simpler code can generate a properly indented string.
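The stacked-indent idea described for IndentedPrintWriter can be sketched with a small plain-JDK class. IndentSketch and its begin, end and line methods are hypothetical names chosen for this example; they are not the library's API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of a writer that stacks indents.
final class IndentSketch {
    private final StringBuilder out = new StringBuilder();
    private final Deque<String> saved = new ArrayDeque<>();
    private String current = "";

    /** Push the current indent and extend it with the given string. */
    IndentSketch begin(String indent) {
        saved.push(current);
        current = current + indent;
        return this;
    }

    /** Restore the indent that was active before the matching begin(). */
    IndentSketch end() {
        current = saved.pop();
        return this;
    }

    /** Write one line, prefixed with the current indent. */
    IndentSketch line(String text) {
        out.append(current).append(text).append('\n');
        return this;
    }

    @Override
    public String toString() {
        return out.toString();
    }
}
```

A usage example: the calls `new IndentSketch().line("a {").begin("  ").line("b").end().line("}")` produce three lines, with "b" indented by two spaces. The calling code never has to track the indent depth itself, which is the benefit the description above points at.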