webhelpers.text

Functions that output text (not HTML).

Helpers for filtering, formatting, and transforming strings.

webhelpers.text.chop_at(s, sub, inclusive=False)

Truncate string s at the first occurrence of sub.

If inclusive is true, truncate just after sub rather than at it.

>>> chop_at("plutocratic brats", "rat")
'plutoc'
>>> chop_at("plutocratic brats", "rat", True)
'plutocrat'

webhelpers.text.collapse(string, character=' ')

Removes the specified character from the beginning and end of the string, then condenses each run of that character within the string into a single occurrence.

Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)
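
No example accompanies this function, so the following is a minimal sketch of the behavior described above. collapse_sketch is a hypothetical stand-in written for illustration, not the library's own code, and it assumes that character is a single character.

```python
import re

def collapse_sketch(string, character=" "):
    """Hypothetical stand-in for collapse (assumes a single character)."""
    # Trim the character from both ends of the string.
    trimmed = string.strip(character)
    # Replace every run of two or more of the character with one.
    # re.escape guards against characters that are special in regexes.
    pattern = "(?:%s){2,}" % re.escape(character)
    return re.sub(pattern, lambda match: character, trimmed)

print(collapse_sketch("  too   many    spaces  "))
print(collapse_sketch("--a---b-", "-"))
```

Under these assumptions the two calls print "too many spaces" and "a-b".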

webhelpers.text.convert_accented_entities(string)

Converts HTML entities into the respective non-accented letters.

Examples:

>>> convert_accented_entities("&aacute;")
'a'
>>> convert_accented_entities("&ccedil;")
'c'
>>> convert_accented_entities("&egrave;")
'e'
>>> convert_accented_entities("&icirc;")
'i'
>>> convert_accented_entities("&oslash;")
'o'
>>> convert_accented_entities("&uuml;")
'u'

Note: This function converts only HTML entities. It does not convert accented characters that appear literally in the string; for that functionality, use the unidecode package.

Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)

webhelpers.text.convert_misc_entities(string)

Converts HTML entities (those produced by common Textile formatting) into plain-text equivalents.

Note: This isn’t an attempt at complete conversion of HTML entities, just those most likely to be generated by Textile.

Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)

webhelpers.text.excerpt(text, phrase, radius=100, excerpt_string='...')

Extract an excerpt from the text, or return '' if the phrase isn’t found.

phrase
Phrase to excerpt from text
radius
How many surrounding characters to include
excerpt_string
Characters surrounding entire excerpt

Example:

>>> excerpt("hello my world", "my", 3)
'...lo my wo...'
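
To make the role of radius concrete, here is a minimal sketch that reproduces the example above. excerpt_sketch is a hypothetical stand-in, not the library's implementation. It assumes a case-sensitive match on the first occurrence and assumes that excerpt_string is added only on a side where text was actually cut off; the real function may differ on both points.

```python
def excerpt_sketch(text, phrase, radius=100, excerpt_string="..."):
    """Hypothetical stand-in for excerpt, written for illustration only."""
    pos = text.find(phrase)
    if pos < 0:
        # Phrase not found: return an empty string.
        return ""
    # Keep up to `radius` characters on each side of the phrase.
    start = max(pos - radius, 0)
    end = min(pos + len(phrase) + radius, len(text))
    result = text[start:end]
    # Assumption: mark only the sides where text was left out.
    if start > 0:
        result = excerpt_string + result
    if end < len(text):
        result = result + excerpt_string
    return result

print(excerpt_sketch("hello my world", "my", 3))
```

With radius=3 the sketch keeps "lo " before the phrase and " wo" after it, which matches the documented result.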

webhelpers.text.lchop(s, sub)

Chop sub off the front of s if present.

>>> lchop("##This is a comment.##", "##")
'This is a comment.##'

The difference between lchop and s.lstrip is that lchop strips only the exact prefix, while s.lstrip treats the argument as a set of leading characters to delete regardless of order.
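
The distinction is easy to see side by side. In this sketch, lchop_sketch is a hypothetical stand-in for the behavior described above, while str.lstrip is the standard Python method.

```python
def lchop_sketch(s, sub):
    """Hypothetical stand-in for lchop: remove sub only as an exact prefix."""
    if sub and s.startswith(sub):
        return s[len(sub):]
    return s

s = "xyxz-data"
print(s.lstrip("xy"))         # strips every leading 'x' or 'y'
print(lchop_sketch(s, "xy"))  # strips the exact prefix 'xy' once
```

Here lstrip returns 'z-data' because it keeps deleting characters from the set {'x', 'y'}, while the prefix-only version returns 'xz-data'.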

webhelpers.text.plural(n, singular, plural, with_number=True)

Return the singular or plural form of a word, according to the number.

If with_number is true (default), the return value will be the number followed by the word. Otherwise the word alone will be returned.

Usage:

>>> plural(2, "ox", "oxen")
'2 oxen'
>>> plural(2, "ox", "oxen", False)
'oxen'

webhelpers.text.rchop(s, sub)

Chop sub off the end of s if present.

>>> rchop("##This is a comment.##", "##")
'##This is a comment.'

The difference between rchop and s.rstrip is that rchop strips only the exact suffix, while s.rstrip treats the argument as a set of trailing characters to delete regardless of order.
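
A common place where this matters is removing a file extension. The sketch below uses a hypothetical stand-in, rchop_sketch, for the behavior described above and compares it with the standard str.rstrip.

```python
def rchop_sketch(s, sub):
    """Hypothetical stand-in for rchop: remove sub only as an exact suffix."""
    if sub and s.endswith(sub):
        return s[:-len(sub)]
    return s

name = "data.log.gz"
print(name.rstrip(".gz"))         # also eats the 'g' of 'log'
print(rchop_sketch(name, ".gz"))  # removes only the '.gz' suffix
```

rstrip returns 'data.lo' because '.', 'g', and 'z' are all in the character set, whereas the suffix-only version returns 'data.log'.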

webhelpers.text.remove_formatting(string)

Simplify HTML text by removing tags and several kinds of formatting.

If the unidecode package is installed, it will also transliterate non-ASCII Unicode characters to their nearest pronunciation equivalent in ASCII.

Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)

webhelpers.text.replace_whitespace(string, replace=' ')

Replace runs of whitespace in string.

Defaults to a single space but any replacement string may be specified as an argument. Examples:

>>> replace_whitespace("Foo       bar")
'Foo bar'
>>> replace_whitespace("Foo       bar", "-")
'Foo-bar'

Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)

webhelpers.text.series(items, conjunction='and', strict_commas=True)

Join strings using commas and a conjunction such as “and” or “or”.

Examples:

>>> series(["A", "B", "C"])
'A, B, and C'
>>> series(["A", "B", "C"], "or")
'A, B, or C'
>>> series(["A", "B", "C"], strict_commas=False)
'A, B and C'
>>> series(["A", "B"])
'A and B'
>>> series(["A"])
'A'
>>> series([])
''

webhelpers.text.strip_leading_whitespace(s)

Strip the leading whitespace in all lines in s.

This deletes all leading whitespace from every line, whereas textwrap.dedent deletes only the whitespace common to all lines.
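
The contrast with textwrap.dedent can be sketched as follows. strip_leading_whitespace_sketch is a hypothetical stand-in that treats only spaces and tabs as leading whitespace; textwrap.dedent is the standard library function.

```python
import re
import textwrap

def strip_leading_whitespace_sketch(s):
    """Hypothetical stand-in: drop spaces and tabs at the start of each line."""
    return re.sub(r"^[ \t]+", "", s, flags=re.MULTILINE)

text = "    def f():\n        return 1\n"
# dedent removes only the four spaces shared by both lines.
print(repr(textwrap.dedent(text)))
# The sketch removes all indentation, flattening the nested line too.
print(repr(strip_leading_whitespace_sketch(text)))
```

dedent preserves the relative indentation ('def f():\n    return 1\n'), while the sketch yields 'def f():\nreturn 1\n'.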

webhelpers.text.truncate(text, length=30, indicator='...', whole_word=False)

Truncate text with replacement characters.

length
The maximum length of text before replacement
indicator
If text exceeds the length, this string will replace the end of the string
whole_word
If true, shorten the string further to avoid breaking a word in the middle. A word is defined as any string not containing whitespace. If the entire text before the break is a single word, it will have to be broken.

Example:

>>> truncate('Once upon a time in a world far far away', 14)
'Once upon a...'
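
The effect of whole_word is easier to see in code. The following is a minimal sketch, assuming that length is at least as long as the indicator; truncate_sketch is a hypothetical stand-in and is not taken from the library's source.

```python
def truncate_sketch(text, length=30, indicator="...", whole_word=False):
    """Hypothetical stand-in for truncate, written for illustration only."""
    if len(text) <= length:
        return text
    # Reserve room for the indicator so the result fits within length.
    cut = max(length - len(indicator), 0)
    head = text[:cut]
    # The cut splits a word when non-whitespace sits on both sides of it.
    splits_word = (cut > 0 and not text[cut - 1].isspace()
                   and not text[cut].isspace())
    if whole_word and splits_word:
        parts = head.rsplit(None, 1)
        # With a single word before the cut there is nothing to back up
        # to, so the word is broken, as the description above allows.
        if len(parts) == 2:
            head = parts[0]
    return head + indicator

story = "Once upon a time in a world far far away"
print(truncate_sketch(story, 16))
print(truncate_sketch(story, 16, whole_word=True))
```

With length=16 the plain call cuts inside "time" and prints 'Once upon a t...', while whole_word=True backs up to the previous word and prints 'Once upon a...'.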

webhelpers.text.urlify(string)

Create a URI-friendly representation of the string.

Can be called manually to generate a URI-friendly version of any string.

If the unidecode package is installed, it will also transliterate non-ASCII Unicode characters to their nearest pronunciation equivalent in ASCII.

Example:

>>> urlify("Mighty Mighty Bosstones")
'mighty-mighty-bosstones'

Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)

Changed in WebHelpers 1.2: urlencode the result in case it contains special characters like "?".

webhelpers.text.wrap_paragraphs(text, width=72)

Wrap all paragraphs in a text string to the specified width.

width may be an int or a textwrap.TextWrapper instance. The latter allows you to set other options besides the width, and is more efficient when wrapping many texts.
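
The two accepted forms of width can be sketched with the standard textwrap module. wrap_paragraphs_sketch is a hypothetical stand-in that assumes paragraphs are separated by blank lines and rewraps each one; the library's own handling of paragraph boundaries and short lines may differ.

```python
import re
import textwrap

def wrap_paragraphs_sketch(text, width=72):
    """Hypothetical stand-in for wrap_paragraphs, for illustration only."""
    # Accept either a ready-made TextWrapper or a plain integer width.
    if isinstance(width, textwrap.TextWrapper):
        wrapper = width
    else:
        wrapper = textwrap.TextWrapper(width=width)
    # Assumption: a blank line marks a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return "\n\n".join(wrapper.fill(p) for p in paragraphs)

text = "The quick brown fox jumps over the lazy dog.\n\nSecond paragraph here."
print(wrap_paragraphs_sketch(text, 20))

# A TextWrapper carries extra options, such as a prefix on every line.
quoting = textwrap.TextWrapper(width=20, initial_indent="> ",
                               subsequent_indent="> ")
print(wrap_paragraphs_sketch(text, quoting))
```

Building the TextWrapper once and reusing it across many calls avoids constructing a new wrapper for every text.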