| Title: | Support Many Languages in R |
| Version: | 0.1.0 |
| Description: | An object model for source text and translations. Find and extract translatable strings. Provide translations and seamlessly retrieve them at runtime. |
| License: | MIT + file LICENSE |
| URL: | https://transltr.ununoctium.dev |
| BugReports: | https://github.com/jeanmathieupotvin/transltr/issues |
| Encoding: | UTF-8 |
| Language: | en |
| RoxygenNote: | 7.3.2 |
| Config/testthat/edition: | 3 |
| Depends: | R (≥ 4.3) |
| Imports: | digest, R6, stringi, utils, yaml |
| Suggests: | covr, devtools, lifecycle, microbenchmark, pkgdown, testthat (≥ 3.0.0), usethis, withr |
| Collate: | 'aaa.R' 'assert.R' 'class-location.R' 'class-text.R' 'class-translator.R' 'find-source-in-exprs.R' 'find-source.R' 'flat.R' 'hash.R' 'language.R' 'normalize.R' 'serialize.R' 'text-io.R' 'translator-io.R' 'transltr-package.R' 'utils-format-vector.R' 'utils-map.R' 'utils-nullish-op.R' 'utils-stop.R' 'utils-strings.R' 'uuid.R' 'zzz.R' |
| NeedsCompilation: | no |
| Packaged: | 2025-02-14 16:05:36 UTC; jmp |
| Author: | Jean-Mathieu Potvin [aut, cre, cph],
Jérôme Lavoué |
| Maintainer: | Jean-Mathieu Potvin <jeanmathieupotvin@ununoctium.dev> |
| Repository: | CRAN |
| Date/Publication: | 2025-02-14 16:40:02 UTC |
Support Many Languages in R
Description
An object model for source text and translations. Find and extract translatable strings. Provide translations and seamlessly retrieve them at runtime.
Introduction
R relies on GNU gettext to
produce multi-lingual messages (if Native Language Support is enabled).
This is well-designed software offering an extensive set of functionalities.
It is ubiquitous and has withstood the test of time. It is not the objective
of transltr to (fully) replace it.
Package transltr provides an alternative in-memory object
model (and further functions) to easily inspect and manipulate source text
and translations.
It does not change any aspect of the underlying locale.
It has its own data serialization formats for I/O purposes. Source text and translations can be exported to text formats that are sharable and easily modifiable, even by non-technical collaborators.
Its features are extensively documented (even internal ones).
It can always locate and extract translatable strings (no matter where they are in the source code).
Translatable source text is treated as a regular R object.
Getting Started
1. Write code as you normally would. Whenever a piece of text (literal
character vectors) should be available in multiple languages, pass it
to method Translator$translate(). You may also use your
own function.
2. Once you are ready to translate your project, call
find_source(). This returns a Translator object.
3. Export the Translator object with translator_write(). Fill in the
underlying translation files.
4. Import translations back into an R session with
translator_read().
Current language and source language are respectively set with
language_set() and language_source_set(). By default, the latter is set
equal to "en" (English).
Bugs and Feedback
You may submit bugs, request features, and provide feedback by creating an issue on GitHub.
Acknowledgements
Warm thanks to Jérôme Lavoué, who supported and sponsored the first release of this project.
Author(s)
Maintainer: Jean-Mathieu Potvin jeanmathieupotvin@ununoctium.dev [copyright holder]
Other contributors:
Jérôme Lavoué jerome.lavoue@umontreal.ca (ORCID) [contributor, funder, reviewer]
See Also
The scattered and incomplete documentation of R's Native Language Support:
- tools::xgettext2pot(), tools::update_pkg_po(), tools::checkPoFiles(),
- Section 3 (Internationalization) of R Internals,
- Section 7 (Internationalization and Localization) of R Installation and Administration, and
- Section 1.8 (Internationalization) of Writing R Extensions.
The comprehensive technical documentation of
GNU gettext.
Hashing Algorithms
Description
These algorithms map a character string to another character string of hexadecimal characters highly likely to be unique. The latter is used to uniquely identify a source text (and the underlying source language).
Usage
algorithms()
Details
Secure Hash Algorithm 1
Method sha1 corresponds to SHA-1 (Secure Hash Algorithm version 1), a
cryptographic hashing function. While it is now superseded by more secure
variants (SHA-256, SHA-512, etc.), it is still useful for non-sensitive
purposes. It is fast, collision-resistant, and may handle very large inputs.
It emits strings of 40 hexadecimal characters.
Cumulative UTF-8 Sum
This method is experimental. Use with caution.
Method utf8 is a simple method derived from cumulative sums of UTF-8 code
points (converted to integers). It is slightly faster than method sha1 for
small inputs and emits hashes with a width proportional to the underlying
input's length. It is used for testing purposes internally.
Find Source Text
Description
Find and extract source text that must be translated.
Usage
find_source(
  path = ".",
  encoding = "UTF-8",
  verbose = getOption("transltr.verbose", TRUE),
  tr = translator(),
  interface = NULL
)
find_source_in_files(
  paths = character(),
  encoding = "UTF-8",
  verbose = getOption("transltr.verbose", TRUE),
  algorithm = algorithms(),
  interface = NULL
)
Arguments
path |
A non-empty and non-NA character string. A path to a directory
containing R source scripts. All subdirectories are searched. Files that
do not have a |
encoding |
A non-empty and non-NA character string. The source character encoding. In almost all cases, this should be UTF-8. Other encodings are internally re-encoded to UTF-8 for portability. |
verbose |
A non-NA logical value. Should progress information be reported? |
tr |
A |
interface |
A |
paths |
A character vector of non-empty and non-NA values. A set of paths to R source scripts that must be searched. |
algorithm |
A non-empty and non-NA character string equal to |
Details
find_source() and find_source_in_files() look for calls to method
Translator$translate() in R scripts and convert them
to Text objects. The former further sets these resulting
objects into a Translator object. See argument tr.
find_source() and find_source_in_files() work on a purely lexical basis.
The source code is parsed but never evaluated (aside from extracted literal
character vectors).
- The underlying Translator object is never evaluated and does not need to exist (placeholders may be used in the source code).
- Only literal character vectors can be passed to arguments of method
Translator$translate().
Interfaces
In some cases, it may not be desirable to call method
Translator$translate() directly. A custom function wrapping
(interfacing) this method may always be used as long as it has the same
signature as method
Translator$translate(). In other words, it must minimally
have two formal arguments: ... and source_lang.
Custom interfaces must be passed to find_source() and
find_source_in_files() for extraction purposes. Since these functions work
on a lexical basis, interfaces can be placeholders in the source code (non-
existent bindings) at the time these functions are called. However, they must
be bound to a function (ultimately) calling Translator$translate()
at runtime.
Custom interfaces are passed to find_source() and find_source_in_files()
as name or call objects in a variety of ways. The most
straightforward way is to use base::quote(). See Examples below.
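As a minimal sketch of the paragraphs above, a custom interface could look like the hypothetical translate() function below. Object tr is only a stand-in (a plain list, not a real Translator object) so that the snippet runs on its own. In a real project, it would be bound to a Translator object at runtime. The last lines show equivalent ways of building the name or call objects that could be passed to argument interface.

```r
# Stand-in for a Translator object (hypothetical, for illustration only).
# Its translate() element mimics the minimal signature: ... and source_lang.
tr <- list(
    translate = function(..., source_lang = "en") {
        paste(..., sep = " ")
    })

# A custom interface: it has the same two formal arguments
# and forwards ... to the underlying translate() method.
translate <- function(..., source_lang = "en") {
    tr$translate(..., source_lang = source_lang)
}

translate("Hello,", "world!")

# Interfaces are referenced lexically as name or call objects.
iface_name <- quote(translate)
iface_call <- quote(transltr::translate)

# quote() is a shortcut for base::as.name() and base::call().
identical(iface_name, as.name("translate"))
identical(iface_call, call("::", as.name("transltr"), as.name("translate")))
```

Both calls to identical() return TRUE: quote() is simply the shortest way to build these objects.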
Methodology
find_source() and find_source_in_files() go through these steps to
extract source text from a single R script.
1. It is read with text_read() and re-encoded to UTF-8 if necessary.
2. It is parsed with parse() and underlying tokens are extracted from parsed expressions with utils::getParseData().
3. Each expression (expr) token is converted to language objects with str2lang(). Parsing errors and invalid expressions are silently skipped.
4. Valid call objects stemming from step 3 are filtered with is_source().
5. Calls to method Translator$translate() or to interface stemming from step 4 are coerced to Text objects with as_text().
These steps are repeated for each R script. find_source() further merges
all resulting Text objects into a coherent set with merge_texts()
(identical source code is merged into single Text entities).
Extracted character vectors are always normalized for consistency (at step
5). See normalize() for more information.
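Steps 2 to 4 can be approximated with base R and utils only. The sketch below is not the package's implementation: is_translate_call() is a simplified, hypothetical stand-in for is_source(), and the source code comes from a character vector instead of a file. It only illustrates how translatable strings can be located without evaluating any code.

```r
# Dummy source code standing in for the contents of an R script.
code <- c(
    "tr$translate('Hello, world!')",
    "x <- 1 + 1",
    "tr$translate('Farewell, world!')")

# Step 2: parse (without evaluating) and extract tokens.
# keep.source = TRUE is required by utils::getParseData().
exprs  <- parse(text = code, keep.source = TRUE)
tokens <- utils::getParseData(exprs, includeText = TRUE)
tokens <- tokens[tokens$token == "expr", ]

# Step 3: convert each expr token to a language object.
# Invalid expressions are silently skipped (mapped to NULL).
langs <- lapply(tokens$text, function(txt) {
    tryCatch(str2lang(txt), error = function(e) NULL)
})

# Step 4: keep calls of the form <anything>$translate(...).
is_translate_call <- function(x) {
    is.call(x) &&
        is.call(x[[1L]]) &&
        identical(x[[1L]][[1L]], quote(`$`)) &&
        identical(x[[1L]][[3L]], quote(translate))
}

calls <- Filter(is_translate_call, langs)

# Extract the first argument (a literal string) of each retained call.
strings <- vapply(calls, function(x) x[[2L]], NA_character_)
strings
```

Object tr never needs to exist here, which mirrors the purely lexical approach described above.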
Limitations
The current version of transltr can only handle literal
character vectors. This means it cannot resolve non-trivial expressions
that depend on a state. All values passed to argument ... of method
Translator$translate() must yield character vectors
(trivially).
Value
find_source() returns an R6 object of class
Translator. If an existing Translator
object is passed to tr, it is modified in place and returned.
find_source_in_files() returns a list of Text objects. It may
contain duplicated elements, depending on the extracted contents.
See Also
Translator,
Text,
normalize(),
translator_read(),
translator_write(),
base::quote(),
base::call(),
base::as.name()
Examples
# Create a directory containing dummy R scripts for illustration purposes.
temp_dir <- file.path(tempdir(TRUE), "find-source")
temp_files <- file.path(temp_dir, c("ex-script-1.R", "ex-script-2.R"))
dir.create(temp_dir, showWarnings = FALSE, recursive = TRUE)
cat(
  "tr$translate('Hello, world!')",
  "tr$translate('Farewell, world!')",
  sep  = "\n",
  file = temp_files[[1L]])
cat(
  "tr$translate('Hello, world!')",
  "tr$translate('Farewell, world!')",
  sep  = "\n",
  file = temp_files[[2L]])
# Extract calls to method Translator$translate().
find_source(temp_dir)
find_source_in_files(temp_files)
# Use custom functions.
# For illustration purposes, assume the package
# exports a hypothetical translate() function.
cat(
  "translate('Hello, world!')",
  "transltr::translate('Farewell, world!')",
  sep  = "\n",
  file = temp_files[[1L]])
cat(
  "translate('Hello, world!')",
  "transltr::translate('Farewell, world!')",
  sep  = "\n",
  file = temp_files[[2L]])
# Extract calls to translate() and transltr::translate().
# Since find_source() and find_source_in_files() work on
# a lexical basis, these are always considered to be two
# distinct functions. They also don't need to exist in the
# R session calling find_source() and find_source_in_files().
find_source(temp_dir, interface = quote(translate))
find_source_in_files(temp_files, interface = quote(transltr::translate))
Find Source Text in Expressions
Description
Find and extract source text that must be translated from a single file
or a set of R expr tokens.
Arguments listed below are not explicitly validated for efficiency.
Usage
find_source_in_file(
  path = "",
  encoding = "UTF-8",
  verbose = getOption("transltr.verbose", TRUE),
  algorithm = algorithms(),
  interface = NULL
)
find_source_in_exprs(
  tokens = utils::getParseData(),
  path = "",
  algorithm = algorithms(),
  interface = NULL
)
find_source_exprs(path = "", encoding = "UTF-8")
is_source(x, interface = NULL)
Arguments
path |
A non-empty and non-NA character string. A path to an R source script. |
encoding |
A non-empty and non-NA character string. The source character encoding. In almost all cases, this should be UTF-8. Other encodings are internally re-encoded to UTF-8 for portability. |
verbose |
A non-NA logical value. Should progress information be reported? |
algorithm |
A non-empty and non-NA character string equal to |
interface |
A |
tokens |
A |
x |
Any R object. |
Details
find_source_in_exprs() silently skips parsing errors. See find_source()
for more information.
is_source() checks if an object conceptually represents a source text.
This can either be
- a call to method Translator$translate() or
- a call to a custom function referenced by interface.
Calls to method Translator$translate() that include
... in their argument(s) are ignored. Such calls are part
of the definition of a custom interface and should not be extracted.
Value
find_source_in_file() and find_source_in_exprs() return a list of
Text objects. It may contain duplicated elements, depending
on the extracted contents.
find_source_exprs() returns a subset of the output of
utils::getParseData(). Only expr tokens are returned.
is_source() returns a logical value.
See Also
Text,
find_source(),
utils::getParseData()
Serialize Objects to Flat Strings
Description
Serialize R objects into textual sequences of unindented (flat) and identifiable sections. These are called FLAT (1.0) objects.
Usage
flat_serialize(x = list(), tag_sep = ": ", tag_empty = "")
flat_deserialize(string = "", tag_sep = ": ")
flat_tag(x = list(), tag_sep = ": ", tag_empty = "")
flat_format(x = list())
flat_example()
Arguments
x |
A list. It can be empty. |
tag_sep |
A non-empty and non-NA character string. The separator to use
when creating tags from names (recursively) extracted from |
tag_empty |
A non-NA character string. The value to use as a substitute for empty names. Positional indices are automatically appended to it to ensure tags are always unique. |
string |
A non-NA character string. It can be empty. Contents to deserialize. |
Details
The Flat format (Flat List As Text, or FLAT) is a minimal
textual data serialization format optimized for R list objects.
Elements are converted to character strings and organized into unindented
sections identified by a tag. Call flat_example() for a valid example.
flat_serialize() serializes x into a FLAT object.
flat_deserialize() is the inverse operation: it converts a FLAT object
back into a list. The latter has the same shape as the original one, but
- atomic vectors are not reconstituted (they are deserialized as elements of length 1), and
- all elements are also left as character strings.
The convention is to serialize an empty list to an empty character string.
Internal mechanisms
flat_tag() and flat_format() are called internally by flat_serialize().
Aside from debugging purposes, they should not be called outside of the
former.
flat_tag() creates tags from names extracted from x and formats them.
Tags may not be unique, depending on x's structure and names.
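To give an intuition of how tags can be derived from (recursively) extracted names, here is a minimal base R sketch. Function make_tags() is hypothetical and is not flat_tag(): it assumes every element is named, ignores argument tag_empty, and only shows how nested names could be joined with tag_sep.

```r
# Hypothetical sketch: build one tag per leaf element by joining
# the names met along the way with tag_sep.
# Assumes all elements of x (at all levels) are named.
make_tags <- function(x = list(), prefix = NULL, tag_sep = ": ") {
    tags <- lapply(seq_along(x), function(i) {
        tag <- paste(c(prefix, names(x)[[i]]), collapse = tag_sep)

        # Recurse into nested lists, carrying the current tag as a prefix.
        if (is.list(x[[i]])) make_tags(x[[i]], tag, tag_sep) else tag
    })

    unlist(tags)
}

tags <- make_tags(list(a = 1L, b = list(c = 2L, d = 3L)))
tags
```

In this example, the nested elements c and d yield tags "b: c" and "b: d", while the top-level element yields tag "a".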
flat_format() recursively formats the elements of x as part of the
serialization process. It
- converts NULL to the "NULL" character string,
- converts other elements to character strings using format() and
- replaces empty lists by a <empty list> constant treated as a placeholder.
Value
flat_serialize() returns a character string, possibly empty.
flat_deserialize() returns a named list, possibly empty. Its structure
depends on the underlying tags.
flat_tag() returns a character vector, possibly empty.
flat_format() returns an unnamed list having the same shape as x. See
Details.
flat_example() returns a character string (a serialized example),
invisibly. It is used for its side-effect of printing an illustration of
the format (with useful information).
Format Vectors
Description
Format atomic vectors, lists, and pairlists.
Usage
format_vector(
  x = vector(),
  label = NULL,
  level = 0L,
  indent = 1L,
  fill_names = FALSE,
  null = "<null>",
  empty = "<empty>",
  validate = TRUE
)
Arguments
x |
A vector of any atomic mode, a list, or a pairlist. It can be empty and it can contain NA values. |
label |
A |
level |
A non-NA integer value. The current depth, or current nesting level to use for indentation purposes. |
indent |
A non-NA integer value. The number of single space(s) to use
for each |
fill_names |
A non-NA logical value. Should |
null |
A non-empty and non-NA character string. The value to use to
represent |
empty |
A non-empty and non-NA character string. The value to use to
represent empty vectors, excluding |
validate |
A non-NA logical value. Should the arguments be validated before being used? This argument should be left as is. |
Details
format_vector() is an alternative to utils::str() that exposes a much
simpler generic formatting interface and yields terser outputs of name/value
pairs. Indentation is used for nested values.
format_vector() does not attempt to cover all R objects like
utils::str(). Instead, it (merely) focuses on efficiently handling the
types used by transltr. It is the low-level workhorse function of
format.Translator(), format.Text(), and format.Location().
Value
A character vector, possibly trimmed by str_trim().
See Also
Other utility functions:
stops(),
str_to(),
vapply_1l()
Hashing
Description
Map an arbitrary character string to a shorter string of hexadecimal characters highly likely to be unique. It typically has a fixed width.
Arguments listed below are not validated for efficiency.
Usage
hash(lang = "", text = "", algorithm = "")
Arguments
lang |
A non-empty and non-NA character string. The underlying language. A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility. |
text |
A non-NA character string. It can be empty. |
algorithm |
A non-empty and non-NA character string equal to |
Details
Hashes generated by hash() uniquely identify the lang and text pair.
Values passed to these arguments are concatenated with a colon character for
hashing purposes.
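The sketch below illustrates the idea with package digest (listed in Imports). It assumes that lang and text are joined as lang:text and that the raw string (not its serialized R representation) is hashed. The exact call made internally by hash() is not shown in this manual, so hash_sketch() is a hypothetical stand-in limited to method sha1.

```r
# Hypothetical stand-in for hash() restricted to method "sha1".
# Requires package digest.
hash_sketch <- function(lang = "", text = "") {
    # Concatenate the pair with a colon, as described above.
    input <- paste(lang, text, sep = ":")

    # serialize = FALSE hashes the string itself,
    # not an R serialization of it.
    digest::digest(input, algo = "sha1", serialize = FALSE)
}

h_en <- hash_sketch("en", "Hello, world!")
h_fr <- hash_sketch("fr", "Hello, world!")

nchar(h_en)            # 40 hexadecimal characters
identical(h_en, h_fr)  # FALSE: the language is part of the hashed input
```

Including the language in the hashed input is what allows the same text to be registered as distinct source texts in different source languages.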
Value
hash() returns a character string, or NULL if algorithm is not
supported.
See Also
Translator,
Text,
normalize(),
algorithms()
Assertions
Description
These functions are a functional implementation of defensive programming.
is_*() functions check whether their argument meets certain criteria.
assert_*() functions further throw an error message when at least one
criterion is not met.
Arguments listed below are not explicitly validated for efficiency.
Usage
is_int(x, allow_empty = FALSE)
is_chr(x, allow_empty = FALSE)
is_lgl1(x)
is_int1(x)
is_chr1(x, allow_empty_string = FALSE)
is_list(x, allow_empty = FALSE)
is_between(x, min = -Inf, max = Inf)
is_named(x, allow_empty_names = FALSE, allow_na_names = FALSE)
is_match(x, choices = vector(), allow_partial = FALSE)
assert_int(
  x,
  allow_empty = FALSE,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)
assert_chr(
  x,
  allow_empty = FALSE,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)
assert_lgl1(x, throw_error = TRUE, x_name = deparse(substitute(x)))
assert_int1(x, throw_error = TRUE, x_name = deparse(substitute(x)))
assert_chr1(
  x,
  allow_empty_string = FALSE,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)
assert_list(
  x,
  allow_empty = FALSE,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)
assert_between(
  x,
  min = -Inf,
  max = Inf,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)
assert_named(
  x,
  allow_empty_names = FALSE,
  allow_na_names = FALSE,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)
assert_match(
  x,
  choices,
  allow_partial = FALSE,
  quote_values = FALSE,
  throw_error = TRUE,
  x_name = deparse(substitute(x))
)
assert_arg(x, quote_values = FALSE, throw_error = TRUE)
assert(x, ...)
## Default S3 method:
assert(x, ...)
Arguments
x |
Any R object. |
allow_empty |
A non-NA logical value. Should vectors of length 0 be considered as valid values? |
allow_empty_string |
A non-NA logical value. Should empty character strings be considered as valid values? |
min |
A non-NA numeric lower bound. It can be infinite. |
max |
A non-NA numeric upper bound. It can be infinite. |
allow_empty_names |
A non-NA logical value. Should empty character strings be considered as valid names? This is different from having no names at all. |
allow_na_names |
A non-NA logical value. Should NA values be considered as valid names? |
choices |
A non-empty vector of valid candidates. |
allow_partial |
A non-NA logical value. Should |
throw_error |
A non-NA logical value. Should an error be thrown? If so,
|
x_name |
A non-empty and non-NA character string. The name of |
quote_values |
A non-NA logical value. Passed as is to |
Details
Guard clauses tend to be verbose and recycled many times within a project.
This makes it hard to keep error messages consistent over time. assert_*()
functions encapsulate usual guard clauses into simple semantic functions.
This reduces code repetition and the number of required unit tests. See
Examples below.
By convention, NA values are always disallowed.
assert_arg() is a partial refactoring of base::match.arg(). It relies
on assert_match() internally and does not have an equivalent is_arg()
function. It must be called within another function.
assert() is an S3 generic function that covers specific data structures.
Classes (and underlying objects) that do not have an assert() method are
considered to be valid by default.
Value
is_int(),
is_chr(),
is_lgl1(),
is_int1(),
is_chr1(),
is_list(),
is_between(),
is_named(), and
is_match() return a logical value.
assert(),
assert_int(),
assert_chr(),
assert_lgl1(),
assert_int1(),
assert_chr1(),
assert_list(),
assert_between(),
assert_named(),
assert_match(), and
assert_arg() return an empty character vector if x meets the underlying
criteria and throw an error otherwise. If throw_error is FALSE, the
error message is returned as a character vector. Unless otherwise stated,
the latter is of length 1 (a character string).
assert.default() always returns an empty character vector.
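The is_*() and assert_*() pairing described in this section can be sketched with base R as follows. Functions is_chr1_sketch() and assert_chr1_sketch() are hypothetical re-creations written for illustration. Their error message is an assumption, and they are not the package's actual code.

```r
# Hypothetical predicate: is x a single non-NA character string?
is_chr1_sketch <- function(x, allow_empty_string = FALSE) {
    is.character(x) &&
        length(x) == 1L &&
        !is.na(x) &&
        (allow_empty_string || nzchar(x))
}

# Hypothetical assertion built on top of the predicate.
# x_name captures the expression passed to x for error messages.
assert_chr1_sketch <- function(
    x,
    allow_empty_string = FALSE,
    throw_error = TRUE,
    x_name = deparse(substitute(x)))
{
    if (is_chr1_sketch(x, allow_empty_string)) {
        # Valid input: return an empty character vector.
        return(character())
    }

    msg <- sprintf("'%s' must be a non-NA character string.", x_name)

    if (throw_error) stop(msg, call. = FALSE)
    msg
}

value <- 1L
assert_chr1_sketch("a")                         # character(0)
assert_chr1_sketch(value, throw_error = FALSE)  # The error message, not thrown
```

Returning the message when throw_error is FALSE lets callers accumulate several messages before reporting them together.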
Get or Set Language
Description
Get or set the current and source languages.
They are registered as environment variables named
TRANSLTR_LANGUAGE and TRANSLTR_SOURCE_LANGUAGE.
Usage
language_set(lang = "en")
language_get()
language_source_set(lang = "en")
language_source_get()
Arguments
lang |
A non-empty and non-NA character string. The underlying language. A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility. |
Details
The language and the source language can always be temporarily changed. See
argument lang of method Translator$translate() for more
information.
The underlying locale is left as is. To change an R session's locale,
use Sys.setlocale() or Sys.setLanguage() instead. See below for more
information.
Value
language_set() and language_source_set() return NULL, invisibly. They
are used for their side-effect of setting environment variables
TRANSLTR_LANGUAGE and TRANSLTR_SOURCE_LANGUAGE, respectively.
language_get() returns a character string. It is the current value of
environment variable TRANSLTR_LANGUAGE. It is empty if the latter is
unset.
language_source_get() returns a character string. It is the current value
of environment variable TRANSLTR_SOURCE_LANGUAGE. It returns "en" if the
latter is unset.
Locales versus languages
A locale is a set of multiple low-level settings that relate to the user's language and region. The language itself is just one parameter among many others.
Modifying a locale on-the-fly can be considered risky in some situations. It may not be the optimal solution for merely changing textual representations of a program or an application at runtime, as it may introduce unintended changes and induce subtle bugs that are harder to fix.
Moreover, it makes sense for some applications and/or programs such as Shiny applications to decouple the front-end's current language (what users see) from the back-end's locale (what developers see). A UI may be displayed in a certain language while keeping logs and R internal messages, warnings, and errors as is.
Consequently, the language setting of transltr is purposely
kept separate from the underlying locale and removes the complexity of
having to support many of them. Users can always change both the locale and
the language parameter of the package. See Examples.
Note
Environment variables are used because they can be shared among different
processes. This matters when using parallel and/or concurrent R sessions.
They can further be shared among direct and transitive dependencies (other
packages that rely on transltr).
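The mechanism can be pictured with base R functions only. The two helpers below are hypothetical and merely mimic the documented behavior of language_source_set() and language_source_get() (an unset variable falls back to "en"). They are a sketch, not the package's source code.

```r
# Hypothetical getter: falls back to "en" when the variable is unset.
source_language_get_sketch <- function() {
    Sys.getenv("TRANSLTR_SOURCE_LANGUAGE", unset = "en")
}

# Hypothetical setter: NULL resets (unsets) the variable.
source_language_set_sketch <- function(lang = "en") {
    if (is.null(lang)) {
        Sys.unsetenv("TRANSLTR_SOURCE_LANGUAGE")
    } else {
        Sys.setenv(TRANSLTR_SOURCE_LANGUAGE = lang)
    }

    invisible(NULL)
}

source_language_set_sketch("fr")
lang_after_set <- source_language_get_sketch()

source_language_set_sketch(NULL)
lang_after_reset <- source_language_get_sketch()
```

Because the value lives in the process environment, child processes started afterwards inherit it without any extra plumbing.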
Examples
# Change the language parameters (globally).
language_source_set("en")
language_set("fr")
language_source_get() ## Outputs "en"
language_get() ## Outputs "fr"
# Change both the language parameter and the locale.
# Note that while users control how languages are named
# for language_set(), they do not for Sys.setLanguage().
language_set("fr")
Sys.setLanguage("fr-CA")
# Reset settings.
language_source_set(NULL)
language_set(NULL)
# Source language has a default value.
language_source_get() ## Outputs "en"
Source Locations
Description
Structure and manipulate source locations. Class Location is
a lighter alternative to srcfile() and other related functionalities.
Usage
location(path = tempfile(), line1 = 1L, col1 = 1L, line2 = 1L, col2 = 1L)
is_location(x)
## S3 method for class 'Location'
format(x, ...)
## S3 method for class 'Location'
print(x, ...)
## S3 method for class 'Location'
c(...)
merge_locations(...)
Arguments
path |
A non-empty and non-NA character string. The origin of the ranges. |
line1, col1 |
A non-empty integer vector of non-NA values. The (inclusive) starting point(s) of what is being referenced. |
line2, col2 |
A non-empty integer vector of non-NA values. The (inclusive) end(s) of what is being referenced. |
x |
Any R object. |
... |
Usage depends on the underlying function.
|
Details
A Location is a set of one or more line/column ranges
referencing contents (like text or source code) within a common origin
identified by an underlying path. The latter is generic and can be
anything: a file on disk, on a network, a pointer, a binding, etc. What
matters is the underlying context.
Location objects may refer to multiple distinct ranges for
the same origin. This is why arguments line1, col1, line2 and
col2 accept integer vectors (and not only scalar values).
Combining Location Objects
c() can only combine Location objects having the same
path. In that case, the underlying ranges are combined into a set of
non-duplicated range(s).
merge_locations() is a generalized version of c() that handles any
number of Location objects having possibly different paths.
It can be viewed as a vectorized version of c().
Value
location() and c() return a named list of length 5 and of S3 class
Location containing the values of path, line1, col1,
line2, and col2.
is_location() returns a logical value.
format() returns a character vector.
print() returns argument x invisibly.
merge_locations() returns a list of (combined) Location
objects.
Examples
# Create Location objects.
loc1 <- location("file-a", 1L, 2L, 3L, 4L)
loc2 <- location("file-a", 5L, 6L, 7L, 8L)
loc3 <- location("file-c", c(9L, 10L), c(11L, 12L), c(13L, 14L), c(15L, 16L))
is_location(loc1) ## TRUE
print(loc1)
print(loc2)
print(loc3)
# Combine Location objects.
# c() throws an error if they do not have the same path.
c(loc1, loc2)
# Location objects with different paths can be merged.
# This groups Location objects according to their paths
# and calls c() on each group. It returns a list.
merge_locations(loc1, loc2, loc3)
# The path of a Location object can be whatever fits the context.
# Below is an example that references text in a character vector
# bound to variable x in the global environment.
x <- "This is a string and it is held in memory for some purpose."
location("<environment: R_GlobalEnv: x>", 1L, 11L, 1L, 16L)
Normalize Text
Description
Construct a standardized string from values passed to ...
Usage
normalize(...)
Arguments
... |
Any number of vectors containing atomic elements. Each vector is normalized as a paragraph.
|
Details
Input text can be written in a variety of ways using single-line and multi-line
strings. Values passed to ... are normalized (to ensure their consistency)
and collapsed to a single character string using the standard paragraph
separator. The latter is defined as two newline characters ("\n\n").
1. NA values and empty strings are discarded before reducing ... to a character string.
2. Whitespace characters (tabs, newlines, and repeated spaces) are replaced by a single space. Paragraph separators are preserved.
3. Leading or trailing whitespaces are stripped.
Value
A character string, possibly empty.
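The normalization rules described in Details can be approximated with base R. Function normalize_sketch() below is a hypothetical illustration and not the package's implementation. In particular, it assumes that elements of a same vector are joined with a single space, which this manual does not specify.

```r
# Hypothetical sketch of the normalization rules.
normalize_sketch <- function(...) {
    paragraphs <- vapply(list(...), function(x) {
        x <- as.character(x)

        # Discard NA values and empty strings.
        x <- x[!is.na(x) & nzchar(x)]

        # Assumption: elements of a vector are joined with a space.
        x <- paste(x, collapse = " ")

        # Replace runs of whitespace by a single space,
        # then strip leading and trailing whitespace.
        trimws(gsub("[ \t\r\n]+", " ", x))
    }, NA_character_)

    # Collapse non-empty paragraphs with the paragraph separator.
    paste(paragraphs[nzchar(paragraphs)], collapse = "\n\n")
}

out <- normalize_sketch(
    "  Hello,\n   world!  ",
    c("Second", NA, "", "paragraph."))

cat(out, "\n")
```

Here, each argument becomes one paragraph, so out contains two paragraphs separated by two newline characters.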
Source Ranges
Description
Create, parse, and validate source ranges.
Usage
range_format(x = location())
range_parse(ranges = character())
range_is_parseable(ranges = character())
Arguments
x |
A |
ranges |
A character vector of non-NA and non-empty values. The ranges to extract pairs of indices (line, column) from. |
Details
Ranges are Ln <int>, Col <int> @ Ln <int>, Col <int> strings created on-the-fly from
Location objects for outputting purposes.
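The range format above lends itself to a simple round trip. The two functions below are hypothetical base R sketches (not range_format() or range_parse() themselves) that build such strings with sprintf() and extract the four indices back with a regular expression.

```r
# Hypothetical formatter: one range string per set of indices.
range_format_sketch <- function(line1, col1, line2, col2) {
    sprintf("Ln %d, Col %d @ Ln %d, Col %d", line1, col1, line2, col2)
}

# Hypothetical parser: returns a list of integer vectors of length 4.
# Strings that do not match yield an empty integer vector.
range_parse_sketch <- function(ranges = character()) {
    pattern <- "^Ln ([0-9]+), Col ([0-9]+) @ Ln ([0-9]+), Col ([0-9]+)$"
    matches <- regmatches(ranges, regexec(pattern, ranges))

    # The first element of each match is the full string. Drop it.
    lapply(matches, function(m) as.integer(m[-1L]))
}

ranges <- range_format_sketch(c(1L, 5L), c(2L, 6L), c(3L, 7L), c(4L, 8L))
parsed <- range_parse_sketch(ranges)

ranges
parsed
```

In this sketch, a string that does not follow the format produces an empty integer vector, which is one possible way to flag an invalid range.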
Value
range_format() returns a character vector. It assumes that x is valid.
range_parse() returns a list having the same length as ranges. Each
element is an integer vector containing 4 non-NA values (unless the
underlying range is invalid).
range_is_parseable() returns a logical vector having the same length as
ranges.
See Also
Serialize Objects
Description
Convert Translator objects, Text objects, and
Location objects to a YAML object, or
vice-versa.
Convert translations contained by a Translator object to
a custom textual representation (a FLAT object), or
vice-versa.
Usage
serialize(x, ...)
serialize_translations(tr = translator(), lang = "")
deserialize(string = "")
deserialize_translations(string = "", tr = NULL)
export_translations(tr = translator(), lang = "")
export(x, ...)
## S3 method for class 'Translator'
export(x, ...)
## S3 method for class 'Text'
export(x, id = uuid(), set_translations = FALSE, ...)
## S3 method for class 'Location'
export(x, id = uuid(), ...)
## S3 method for class 'ExportedTranslator'
assert(x, throw_error = TRUE, ...)
## S3 method for class 'ExportedText'
assert(x, throw_error = TRUE, ...)
## S3 method for class 'ExportedLocation'
assert(x, throw_error = TRUE, ...)
## S3 method for class 'ExportedTranslations'
assert(x, throw_error = TRUE, ...)
import(x, ...)
## S3 method for class 'ExportedTranslator'
import(x, ...)
## S3 method for class 'ExportedText'
import(x, ...)
## S3 method for class 'ExportedLocation'
import(x, ...)
## S3 method for class 'ExportedTranslations'
import(x, tr = NULL, ...)
## Default S3 method:
import(x, ...)
format_errors(errors = character(), id = uuid(), throw_error = TRUE)
Arguments
x |
Any R object. |
... |
Further arguments passed to, or from other methods. |
tr |
A This argument is |
lang |
A non-empty and non-NA character string. The underlying language. A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility. |
string |
A non-empty and non-NA character string. Contents to deserialize. |
id |
A non-empty and non-NA character string. A unique identifier for the underlying object. It is used for validation purposes. |
set_translations |
A non-NA logical value. Should translations be
included in the resulting |
throw_error |
A non-NA logical value. Should an error be thrown? If so,
|
errors |
A non-empty character vector of non-NA values. Error message(s) describing why object(s) are invalid. |
Details
The information contained within a Translator object is
split by default. Unless set_translations is TRUE, translations are
serialized independently from other fields. This is useful when creating
Translator files and translations files.
While serialize() and serialize_translations() are distinct, they share
a common design and perform the same thing, at least conceptually. The
same is true for deserialize() and deserialize_translations(). These 4
functions are those that should be used in almost all circumstances.
Serialization
The data serialization process performed by serialize() and
serialize_translations() is internally broken down into 2 steps: objects
are first exported before being serialized.
export() and export_translations() are preserializing mechanisms that
convert objects into transient objects that ease the conversion process.
They are never returned to the user: serialize() and
serialize_translations() immediately transform them into character strings.
serialize() returns a YAML object.
serialize_translations() returns a FLAT object.
Deserialization
The data deserialization process performed by deserialize() and
deserialize_translations() is internally broken down into 3 steps: objects
are first deserialized, then validated and finally, imported.
deserialize() and deserialize_translations() are
raw deserializer mechanisms: string is converted into an R named list
that is presumed to be an exported object. deserialize() relies on
YAML tags to infer the class of each
object.
The contents of the transient objects are thoroughly checked with an
assert() method (based on the underlying presumed class). Valid objects
are imported back into an appropriate R object with import().
Custom fields and comments added by users to serialized objects are ignored.
Formatting errors
assert() methods accumulate error messages before returning, or throwing
them. format_errors() is a helper function that eases this process. It
exists to avoid repeating code in each method. There is no reason to call
it outside of assert() methods.
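The accumulate-then-report pattern described above can be sketched in base R. This is only an illustration under stated assumptions: check_exported() and the messages it produces are hypothetical, they are not part of the package, and the input is assumed to be a list.

```r
# Hypothetical validator (not part of transltr). It collects every
# problem found in a list before reporting them all at once.
check_exported <- function(x = list(), throw_error = TRUE) {
  errors <- character()

  id <- x[["Identifier"]]
  if (!is.character(id) || length(id) != 1L || is.na(id)) {
    errors <- c(errors, "'Identifier' must be a non-NA character string.")
  }

  lang <- x[["Language Code"]]
  if (!is.character(lang) || length(lang) != 1L || is.na(lang) || !nzchar(lang)) {
    errors <- c(errors, "'Language Code' must be a non-empty character string.")
  }

  # Throw a single error listing all accumulated messages, or
  # return them (possibly an empty vector) to the caller.
  if (throw_error && length(errors)) {
    msg <- paste0(c("invalid object:", paste(" -", errors)), collapse = "\n")
    stop(msg, call. = FALSE)
  }

  errors
}

# Returns two messages instead of stopping at the first problem.
check_exported(list(Identifier = 1L), throw_error = FALSE)
```

Reporting all problems together spares users from fixing a file one error at a time.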
Value
See other sections for further information.
serialize(), and serialize_translations() return a character string.
export() returns a named list of S3 class
- ExportedTranslator if x is a Translator object,
- ExportedText if x is a Text object, or
- ExportedLocation if x is a Location object.
export_translations() returns an ExportedTranslations object.
deserialize() and import() return
- a Translator object if x is a valid ExportedTranslator object,
- a Text object if x is a valid ExportedText object, or
- a Location object if x is a valid ExportedLocation object.
deserialize_translations() and import.ExportedTranslations() return an
ExportedTranslations object. They further register imported
translations if a Translator object is passed to tr.
Translations must correspond to an existing source text (a registered Text object). Otherwise, they are skipped.
The value passed to tr is updated by reference and is not returned.
import.default() is used for its side-effect of throwing an error for
unsupported objects.
assert.ExportedTranslator(),
assert.ExportedText(),
assert.ExportedLocation(), and
assert.ExportedTranslations() return a character vector, possibly empty.
If throw_error is TRUE, an error is thrown if an object is invalid.
format_errors() returns a character vector, and outputs its contents as
an error if throw_error is TRUE.
Exported Objects
An exported object is a named list of S3 class
ExportedTranslator,
ExportedText,
ExportedLocation, or
ExportedTranslations and
always having a tag attribute whose value is equal to the super-class of
x.
There are four main differences between an object and its exported counterpart.
Field names are slightly more verbose.
Source text is treated independently from translations.
Unset fields are set equal to NULL (a ~ in YAML).
Each object has an Identifier used to locate errors.
The correspondence between objects is self-explanatory.
See class Translator for more information on class ExportedTranslator.
See class Text for more information on class ExportedText.
See class Location for more information on class ExportedLocation.
You may also explore provided examples below.
The ExportedTranslations Class
ExportedTranslations objects are created from a
Translator object with export_translations(). Their purpose
is to restructure translations by language. They are different from other
exported objects because there is no corresponding Translations class.
An ExportedTranslations object is a named list of S3 class
ExportedTranslations containing the following elements.
Identifier
The unique identifier of argument tr. See Translator$id for more information.
Language Code
The value of argument lang.
Language
The translation's language. See Translator$native_languages for more information.
Source Language
The source text's language. See Translator$source_langs for more information.
Translations
A named list containing further named lists. Each sublist contains two values:
Source Text
A non-empty and non-NA character string.
Translation
A non-empty and non-NA character string.
See Text$translations for more information.
Unavailable translations are automatically replaced by a placeholder that depends on whether they are exported or imported.
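Restructuring translations by language can be illustrated with a small base R sketch. The data layout, the names h1 and h2, and the placeholder value "<untranslated>" are assumptions made for this example only. They do not reflect the package's internal structures or its actual placeholders.

```r
# Hypothetical input: translations grouped by source text.
# Names h1 and h2 stand for the hashes of two source texts.
texts <- list(
  h1 = c(en = "Hello, world!", fr = "Bonjour, monde!"),
  h2 = c(en = "Farewell, world!"))

# All languages registered in at least one source text.
langs <- unique(unlist(lapply(texts, names)))

# Regroup by language. Missing translations are replaced
# by a placeholder (an assumption of this sketch).
by_lang <- lapply(stats::setNames(langs, langs), function(lang) {
  lapply(texts, function(txt) {
    value <- unname(txt[lang])
    if (is.na(value)) "<untranslated>" else value
  })
})

by_lang$fr
```

Each element of by_lang now holds everything a translator needs for a single language.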
Note
Dividing the serialization and deserialization processes into multiple steps helps keep the underlying functions short and easier to test.
See Also
Official YAML 1.1 specification,
yaml::as.yaml(),
yaml::yaml.load(),
flat_serialize(),
flat_deserialize(),
translator_read(),
translator_write(),
translations_read(),
translations_write()
Throw Errors
Description
stops() is equivalent to stop(..., call. = FALSE). It removes calls
from error messages by default. These are rarely useful and confuse users
more often than they help them.
stopf() is equivalent to stops(sprintf(fmt, ...)). It wraps
base::sprintf() and stops() and is used to construct flexible
error messages.
Usage
stops(...)
stopf(fmt = "", ...)
Arguments
... |
Further arguments respectively passed to base::stop(), and base::sprintf(). |
fmt |
A character of length 1 passed as is to base::sprintf(). |
Value
Nothing. These functions are used for their side-effect of raising an error.
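The equivalences stated in the Description can be reproduced with two one-line wrappers. The names my_stops() and my_stopf() are hypothetical and are used here to avoid implying that this is the package's actual source code.

```r
# Hypothetical stand-ins mirroring the documented equivalences.
my_stops <- function(...) stop(..., call. = FALSE)
my_stopf <- function(fmt = "", ...) my_stops(sprintf(fmt, ...))

# Capture the error to inspect it instead of halting.
err <- tryCatch(
  my_stopf("'%s' must be of length %d.", "x", 1L),
  error = identity)

conditionMessage(err)  # the formatted message
conditionCall(err)     # NULL: no call is attached to the error
```

Because call. = FALSE is passed to stop(), the error message is not prefixed with the name of the calling function.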
See Also
Other utility functions:
format_vector(),
str_to(),
vapply_1l()
Character String Utilities
Description
str_to() converts an R object to a character string. It is a slightly
more flexible alternative to base::toString().
str_trim() wraps base::strtrim() and further adds a ... suffix to
each trimmed element.
str_wrap() wraps base::strwrap() and ensures a character string is
returned.
Usage
str_to(x, ...)
## Default S3 method:
str_to(x, quote_values = FALSE, last_sep = ", or ", ...)
str_trim(x = character(), width = 80L)
str_wrap(x = character(), width = 80L)
Arguments
x |
Any R object for |
... |
Further arguments passed to, or from other methods. |
quote_values |
A non-NA logical value. Should elements of |
last_sep |
A non-empty and non-NA character string separating the last and penultimate elements. |
width |
A non-NA integer value. The target width for individual
elements of |
Details
str_to() concatenates all elements with ", ", except for the last
one. See argument last_sep.
str_wrap() preserves existing paragraph separators ("\n\n").
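The concatenation rule described for str_to() can be sketched in base R as follows. join_with_last_sep() is a hypothetical helper written for this illustration. How the package handles inputs of length 0, 1, or 2 is not stated above, so the choices made here are assumptions.

```r
# Hypothetical helper: join elements with ", ", except for the
# last one, which is introduced by last_sep.
join_with_last_sep <- function(x, last_sep = ", or ") {
  x <- as.character(x)
  n <- length(x)

  # Nothing to separate for fewer than two elements.
  if (n < 2L) {
    return(paste0(x, collapse = ""))
  }

  paste0(paste0(x[-n], collapse = ", "), last_sep, x[n])
}

join_with_last_sep(c("a", "b", "c"))                    # "a, b, or c"
join_with_last_sep(c("a", "b", "c"), last_sep = " and ") # "a, b and c"
```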
Value
str_to() and str_wrap() return a character string.
str_trim() returns a character vector having the same length as x.
See Also
Other utility functions:
format_vector(),
stops(),
vapply_1l()
Source Text
Description
Structure source text and its translations.
Usage
text(..., source_lang = language_source_get(), algorithm = algorithms())
is_text(x)
## S3 method for class 'Text'
format(x, ...)
## S3 method for class 'Text'
print(x, ...)
## S3 method for class 'Text'
c(...)
merge_texts(..., algorithm = algorithms())
as_text(x, ...)
## S3 method for class 'call'
as_text(x, loc = location(), algorithm = algorithms(), ...)
Arguments
... |
Usage depends on the underlying function. |
source_lang |
A non-empty and non-NA character string. The language of the source text. A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1. Doing so maximizes portability and cross-compatibility between packages. |
algorithm |
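A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes. |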
x |
Any R object. |
loc |
A Location object. |
Details
A Text object is a piece of source text that is extracted from R
source scripts.
It (typically) has one or more Locations within a project.
It has zero or more translations.
The Text class structures this information and exposes a set of
methods to manipulate it.
Combining Text Objects
c() can only combine Text objects having the same hash.
This is equivalent to having the same algorithm, source_lang, and
source_text. In that case, the underlying translations and
Location objects are combined and a new object is returned.
It throws an error if all Text objects are empty (they have no
set source_lang).
merge_texts() is a generalized version of c() that handles any number
of Text objects having possibly different hashes. It can be
viewed as a vectorized version of c(). It silently ignores and drops
all empty Text objects.
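The idea of grouping by hash before combining can be sketched with plain lists. The records below are hypothetical stand-ins for Text objects, and this is not the package's implementation. It only illustrates the split-then-combine approach.

```r
# Hypothetical records standing in for Text objects.
records <- list(
  list(hash = "h1", translations = c(en = "Hello, world!")),
  list(hash = "h1", translations = c(fr = "Bonjour, monde!")),
  list(hash = "h2", translations = c(en = "Farewell, world!")))

# Group records sharing the same hash.
hashes <- vapply(records, function(r) r$hash, character(1L))
groups <- split(records, hashes)

# Combine each group into a single record. Duplicated
# languages are dropped (the first occurrence is kept).
merged <- lapply(groups, function(group) {
  translations <- unlist(lapply(group, function(r) r$translations))
  list(
    hash         = group[[1L]]$hash,
    translations = translations[!duplicated(names(translations))])
})

names(merged)  # one combined record per distinct hash
```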
Coercion
as_text() is an S3 generic function that attempts to coerce its argument
into a suitable Text object. as_text.call() is the method used
by find_source() to coerce a call object to a Text
object. While it can be used, it should be avoided most of the time. Users
may extend it by defining their own methods.
Value
text(), c(), and as_text() return an R6 object of
class Text.
is_text() returns a logical value.
format() returns a character vector.
print() returns argument x invisibly.
merge_texts() returns a list of (combined) Text objects. It
can be empty if all underlying Text objects are empty.
Active bindings
hash
A non-empty and non-NA character string. A reproducible hash generated from source_lang and source_text, and by using the algorithm specified by algorithm. It is used as a unique identifier for the underlying Text object.
This is a read-only field. It is automatically updated whenever fields source_lang and/or algorithm are updated.
algorithm
A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes.
source_lang
A non-empty and non-NA character string. The language of the source text.
A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1. Doing so maximizes portability and cross-compatibility between packages.
source_text
A non-empty and non-NA character string. The source text. This is a read-only field.
languages
A character vector. Registered language codes. This is a read-only field. Use methods below to update it.
translations
A named character vector. Registered translations of source_text, including the latter. Names correspond to languages. This is a read-only field. Use methods below to update it.
locations
A list of Location objects giving the location(s) of source_text in the underlying project. It can be empty. This is a read-only field. Use methods below to update it.
Methods
Public methods
Method new()
Create a Text object.
Usage
Text$new(algorithm = algorithms())
Arguments
algorithm
A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes.
Returns
An R6 object of class Text.
Examples
# Consider using text() instead.
txt <- Text$new()
Method get_translation()
Extract a translation, or the source text.
Usage
Text$get_translation(lang = "")
Arguments
lang
A non-empty and non-NA character string. The underlying language.
A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility.
Returns
A character string. NULL is returned if the requested
translation is not available.
Examples
txt <- Text$new()
txt$set_translation("en", "Hello, world!")
txt$get_translation("en") ## Outputs "Hello, world!"
txt$get_translation("fr") ## Outputs NULL
Method set_translation()
Register a translation, or the source text.
Usage
Text$set_translation(lang = "", text = "")
Arguments
lang
A non-empty and non-NA character string. The underlying language.
A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility.
text
A non-empty and non-NA character string. A translation, or the source text.
Details
This method is also used to register source_lang and
source_text before setting them as such. See Examples below.
Returns
A NULL, invisibly.
Examples
# Register a pair of source_lang and source_text.
txt <- Text$new()
txt$set_translation("en", "Hello, world!")
txt$source_lang <- "en"
Method set_translations()
Register one or more translations, and/or the source text.
Usage
Text$set_translations(...)
Arguments
...
Any number of named, non-empty, and non-NA character strings.
Details
This method can be viewed as a vectorized version of
method set_translation().
Returns
A NULL, invisibly.
Examples
txt <- Text$new()
txt$set_translations(en = "Hello, world!", fr = "Bonjour, monde!")
Method set_locations()
Register one or more locations.
Usage
Text$set_locations(...)
Arguments
...
Any number of Location objects.
Details
This method calls merge_locations() to merge all
values passed to ... together with previously registered
Location objects. The underlying registered
paths and/or ranges won't be duplicated.
Returns
A NULL, invisibly.
Examples
txt <- Text$new()
txt$set_locations(
location("a", 1L, 2L, 3L, 4L),
location("a", 1L, 2L, 3L, 4L),
location("b", 5L, 6L, 7L, 8L))
Method rm_translation()
Remove a registered translation.
Usage
Text$rm_translation(lang = "")
Arguments
lang
A non-empty and non-NA character string identifying a translation to be removed.
Details
You cannot remove lang when it is registered as the
current source_lang. You must update source_lang before
doing so.
Returns
A NULL, invisibly.
Examples
txt <- Text$new()
txt$set_translations(en = "Hello, world!", fr = "Bonjour, monde!")
txt$source_lang <- "en"
# Remove source_lang and source_text.
txt$source_lang <- "fr"
txt$rm_translation("en")
Method rm_location()
Remove a registered location.
Usage
Text$rm_location(path = "")
Arguments
path
A non-empty and non-NA character string identifying a Location object to be removed.
Returns
A NULL, invisibly.
Examples
txt <- Text$new()
txt$set_locations(
location("a", 1L, 2L, 3L, 4L),
location("b", 5L, 6L, 7L, 8L))
txt$rm_location("a")
Examples
# Set source language.
language_source_set("en")
# Create Text objects.
txt1 <- text(
location("a", 1L, 2L, 3L, 4L),
location("a", 1L, 2L, 3L, 4L),
location("b", 5L, 6L, 7L, 8L),
location("c", c(9L, 10L), c(11L, 12L), c(13L, 14L), c(15L, 16L)),
en = "Hello, world!",
fr = "Bonjour, monde!",
es = "¡Hola, mundo!")
txt2 <- text(
location("a", 1L, 2L, 3L, 4L),
en = "Hello, world!",
fr = "Bonjour, monde!",
es = "¡Hola, mundo!")
txt3 <- text(
source_lang = "fr2",
location("a", 5L, 6L, 7L, 8L),
en = "Hello, world!",
fr2 = "Bonjour le monde!",
es = "¡Hola, mundo!")
is_text(txt1)
# Text objects have a specific format.
# print() calls format() internally, as expected.
print(txt1)
print(txt2)
print(txt3)
# Combine Text objects.
# c() throws an error if they do not have the same
# hash (same source_text, source_lang, and algorithm).
c(txt1, txt2)
# Text objects with different hashes can be merged.
# This groups Text objects according to their hashes
# and calls c() on each group. It returns a list.
merge_texts(txt1, txt2, txt3)
# Objects can be coerced to a Text object with as_text(). Below is an
# example for call objects. This is for illustration purposes only,
# and the latter should not be used. This method is used internally by
# find_source().
cl <- str2lang("translate('Hello, world!')")
loc <- location("example in class-text", 2L, 32L, 2L, 68L)
as_text(cl, loc)
## ------------------------------------------------
## Method `Text$new`
## ------------------------------------------------
# Consider using text() instead.
txt <- Text$new()
## ------------------------------------------------
## Method `Text$get_translation`
## ------------------------------------------------
txt <- Text$new()
txt$set_translation("en", "Hello, world!")
txt$get_translation("en") ## Outputs "Hello, world!"
txt$get_translation("fr") ## Outputs NULL
## ------------------------------------------------
## Method `Text$set_translation`
## ------------------------------------------------
# Register a pair of source_lang and source_text.
txt <- Text$new()
txt$set_translation("en", "Hello, world!")
txt$source_lang <- "en"
## ------------------------------------------------
## Method `Text$set_translations`
## ------------------------------------------------
txt <- Text$new()
txt$set_translations(en = "Hello, world!", fr = "Bonjour, monde!")
## ------------------------------------------------
## Method `Text$set_locations`
## ------------------------------------------------
txt <- Text$new()
txt$set_locations(
location("a", 1L, 2L, 3L, 4L),
location("a", 1L, 2L, 3L, 4L),
location("b", 5L, 6L, 7L, 8L))
## ------------------------------------------------
## Method `Text$rm_translation`
## ------------------------------------------------
txt <- Text$new()
txt$set_translations(en = "Hello, world!", fr = "Bonjour, monde!")
txt$source_lang <- "en"
# Remove source_lang and source_text.
txt$source_lang <- "fr"
txt$rm_translation("en")
## ------------------------------------------------
## Method `Text$rm_location`
## ------------------------------------------------
txt <- Text$new()
txt$set_locations(
location("a", 1L, 2L, 3L, 4L),
location("b", 5L, 6L, 7L, 8L))
txt$rm_location("a")
Read and Write Text
Description
text_read() and text_write() respectively wrap base::readLines() and
base::writeLines(). They further validate their arguments, normalize
file paths and re-encode inputs to UTF-8 before reading and writing.
Usage
text_read(path = "", encoding = "UTF-8")
text_write(x = character(), path = "", encoding = "UTF-8")
Arguments
path |
A non-empty and non-NA character string. A path to a file to read text from, or write text to. |
encoding |
A non-empty and non-NA character string. The source character encoding. In almost all cases, this should be UTF-8. Other encodings are internally re-encoded to UTF-8 for portability. |
x |
A character vector. Lines of text to write. Its current encoding is
given by |
Value
text_read() returns a character vector.
text_write() returns NULL, invisibly.
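A rough base R sketch of a UTF-8 round trip built on the two wrapped functions is shown below. It is an illustration only and does not reproduce the validation performed by text_read() and text_write(). The temporary file and the sample lines are assumptions of the example.

```r
# Sample lines, explicitly re-encoded to UTF-8 before writing.
lines <- enc2utf8(c("Hello, world!", "\u00a1Hola, mundo!"))

# Normalize the path of a temporary file (it does not exist yet).
path <- normalizePath(tempfile(fileext = ".txt"), mustWork = FALSE)

# Write bytes as is (they are already UTF-8), then read them
# back and mark the resulting strings as UTF-8.
writeLines(lines, path, useBytes = TRUE)
read_back <- readLines(path, encoding = "UTF-8")

# Clean up the temporary file.
unlink(path)

read_back
```

Passing useBytes = TRUE prevents writeLines() from re-encoding the strings to the native encoding of the session, which matters on systems that do not use UTF-8 by default.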
See Also
readLines(),
writeLines(),
iconv()
Source Text and Translations
Description
Structure and manipulate the source text of a project and its translations.
Usage
translator(..., id = uuid(), algorithm = algorithms())
is_translator(x)
## S3 method for class 'Translator'
format(x, ...)
## S3 method for class 'Translator'
print(x, ...)
Arguments
... |
Usage depends on the underlying function.
|
id |
A non-empty and non-NA character string. A globally unique
identifier for the Translator object. Beware of collisions when using user-defined values. |
algorithm |
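A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes. |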
x |
Any R object. |
Details
A Translator object encapsulates the source text of a project
(or any other context) and all related translations. Under the hood,
Translator objects are collections of Text objects.
These do most of the work. They are treated as lower-level components and in
typical situations, users rarely interact with them.
Translator objects can be saved and exported with
translator_write(). They can be imported back into an R session
with translator_read().
Value
translator() returns an R6 object of class
Translator.
is_translator() returns a logical value.
format() returns a character vector.
print() returns argument x invisibly.
Active bindings
id
A non-empty and non-NA character string. A globally unique identifier for the underlying object. Beware of plausible collisions when using user-defined values.
algorithm
A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes.
hashes
A character vector of non-empty and non-NA values, or NULL. The set of all hash exposed by registered Text objects. If there is none, hashes is NULL. This is a read-only field updated whenever field algorithm is updated.
source_texts
A character vector of non-empty and non-NA values, or NULL. The set of all registered source texts. If there is none, source_texts is NULL. This is a read-only field.
source_langs
A character vector of non-empty and non-NA values, or NULL. The set of all registered source languages. This is a read-only field.
If there is none, source_langs is NULL.
If there is one unique value, source_langs is an unnamed character string.
Otherwise, it is a named character vector.
languages
A character vector of non-empty and non-NA values, or NULL. The set of all registered languages (codes). If there is none, languages is NULL. This is a read-only field.
native_languages
A named character vector of non-empty and non-NA values, or NULL. A map (bijection) of languages (codes) to native language names. Names are codes and values are native languages. If there is none, native_languages is NULL.
While users retain full control over native_languages, it is best to use well-known schemes such as IETF BCP 47, or ISO 639-1. Doing so maximizes portability and cross-compatibility between packages.
Update this field with method $set_native_languages(). See below for more information.
Methods
Public methods
Method new()
Create a Translator object.
Usage
Translator$new(id = uuid(), algorithm = algorithms())
Arguments
id
A non-empty and non-NA character string. A globally unique identifier for the Translator object. Beware of collisions when using user-defined values.
algorithm
A non-empty and non-NA character string equal to "sha1", or "utf8". The algorithm to use when hashing source information for identification purposes.
Returns
An R6 object of class Translator.
Examples
# Consider using translator() instead.
tr <- Translator$new()
Method translate()
Translate source text.
Usage
Translator$translate( ..., lang = language_get(), source_lang = language_source_get() )
Arguments
...
Any number of vectors containing atomic elements. Each vector is normalized as a paragraph.
Elements are coerced to character values.
NA values and empty strings are discarded.
Multi-line strings are supported and encouraged. Blank lines (two or more newline characters) are interpreted as paragraph separators.
lang
A non-empty and non-NA character string. The underlying language.
A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility.
source_lang
A non-empty and non-NA character string. The language of the source text. See argument lang for more information.
Details
See normalize() for further details on how ... is normalized.
Returns
A character string. If there is no corresponding translation,
the value passed to method $set_default_value() is
returned. NULL is returned by default.
Examples
tr <- Translator$new()
tr$set_text(en = "Hello, world!", fr = "Bonjour, monde!")
tr$translate("Hello, world!", lang = "en") ## Outputs "Hello, world!"
tr$translate("Hello, world!", lang = "fr") ## Outputs "Bonjour, monde!"
Method get_translation()
Extract a translation or a source text.
Usage
Translator$get_translation(hash = "", lang = "")
Arguments
hash
A non-empty and non-NA character string. The unique identifier of the requested source text.
lang
A non-empty and non-NA character string. The underlying language.
A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility.
Returns
A character string. If there is no corresponding translation,
the value passed to method $set_default_value() is
returned. NULL is returned by default.
Examples
tr <- Translator$new()
tr$set_text(en = "Hello, world!")
# Consider using translate() instead.
tr$get_translation("256e0d7", "en") ## Outputs "Hello, world!"
Method get_text()
Extract a source text and its translations.
Usage
Translator$get_text(hash = "")
Arguments
hash
A non-empty and non-NA character string. The unique identifier of the requested source text.
Returns
A Text object, or NULL.
Examples
tr <- Translator$new()
tr$set_text(en = "Hello, world!")
tr$get_text("256e0d7")
Method set_text()
Register a source text.
Usage
Translator$set_text(..., source_lang = language_source_get())
Arguments
Returns
A NULL, invisibly.
Examples
tr <- Translator$new()
tr$set_text(en = "Hello, world!", location())
Method set_texts()
Register one or more source texts.
Usage
Translator$set_texts(...)
Arguments
...
Any number of Text objects.
Details
This method calls merge_texts() to merge all values
passed to ... together with previously registered
Text objects. The underlying registered source texts,
translations, and Location objects won't be
duplicated.
Returns
A NULL, invisibly.
Examples
# Set source language.
language_source_set("en")
tr <- Translator$new()
# Create Text objects.
txt1 <- text(
location("a", 1L, 2L, 3L, 4L),
en = "Hello, world!",
fr = "Bonjour, monde!")
txt2 <- text(
location("b", 5L, 6L, 7L, 8L),
en = "Farewell, world!",
fr = "Au revoir, monde!")
tr$set_texts(txt1, txt2)
Method rm_text()
Remove a registered source text.
Usage
Translator$rm_text(hash = "")
Arguments
hash
A non-empty and non-NA character string identifying the source text to remove.
Returns
A NULL, invisibly.
Examples
tr <- Translator$new()
tr$set_text(en = "Hello, world!")
tr$rm_text("256e0d7")
Method set_native_languages()
Map a language code to a native language name.
Usage
Translator$set_native_languages(...)
Arguments
...
Any number of named, non-empty, and non-NA character strings. Names are codes and values are native languages. See field native_languages for more information.
Returns
A NULL, invisibly.
Examples
tr <- Translator$new()
tr$set_native_languages(en = "English", fr = "Français")
# Remove existing entries.
tr$set_native_languages(fr = NULL)
Method set_default_value()
Register a default value to return when there is no corresponding translation for the requested language.
Usage
Translator$set_default_value(value = NULL)
Arguments
value
A NULL or a non-NA character string. It can be empty. The former is returned by default.
Details
This modifies what methods $translate() and
$get_translation() return when there is no
translation for lang.
Returns
A NULL, invisibly.
Examples
tr <- Translator$new()
tr$set_default_value("<unavailable>")
See Also
find_source(),
translator_read(),
translator_write()
Examples
# Set source language.
language_source_set("en")
# Create a Translator object.
# This would normally be done automatically
# by find_source(), or translator_read().
tr <- translator(
id = "test-translator",
en = "English",
es = "Español",
fr = "Français",
text(
location("a", 1L, 2L, 3L, 4L),
en = "Hello, world!",
fr = "Bonjour, monde!"),
text(
location("b", 1L, 2L, 3L, 4L),
en = "Farewell, world!",
fr = "Au revoir, monde!"))
is_translator(tr)
# Translator objects have a specific format.
# print() calls format() internally, as expected.
print(tr)
## ------------------------------------------------
## Method `Translator$new`
## ------------------------------------------------
# Consider using translator() instead.
tr <- Translator$new()
## ------------------------------------------------
## Method `Translator$translate`
## ------------------------------------------------
tr <- Translator$new()
tr$set_text(en = "Hello, world!", fr = "Bonjour, monde!")
tr$translate("Hello, world!", lang = "en") ## Outputs "Hello, world!"
tr$translate("Hello, world!", lang = "fr") ## Outputs "Bonjour, monde!"
## ------------------------------------------------
## Method `Translator$get_translation`
## ------------------------------------------------
tr <- Translator$new()
tr$set_text(en = "Hello, world!")
# Consider using translate() instead.
tr$get_translation("256e0d7", "en") ## Outputs "Hello, world!"
## ------------------------------------------------
## Method `Translator$get_text`
## ------------------------------------------------
tr <- Translator$new()
tr$set_text(en = "Hello, world!")
tr$get_text("256e0d7")
## ------------------------------------------------
## Method `Translator$set_text`
## ------------------------------------------------
tr <- Translator$new()
tr$set_text(en = "Hello, world!", location())
## ------------------------------------------------
## Method `Translator$set_texts`
## ------------------------------------------------
# Set source language.
language_source_set("en")
tr <- Translator$new()
# Create Text objects.
txt1 <- text(
location("a", 1L, 2L, 3L, 4L),
en = "Hello, world!",
fr = "Bonjour, monde!")
txt2 <- text(
location("b", 5L, 6L, 7L, 8L),
en = "Farewell, world!",
fr = "Au revoir, monde!")
tr$set_texts(txt1, txt2)
## ------------------------------------------------
## Method `Translator$rm_text`
## ------------------------------------------------
tr <- Translator$new()
tr$set_text(en = "Hello, world!")
tr$rm_text("256e0d7")
## ------------------------------------------------
## Method `Translator$set_native_languages`
## ------------------------------------------------
tr <- Translator$new()
tr$set_native_languages(en = "English", fr = "Français")
# Remove existing entries.
tr$set_native_languages(fr = NULL)
## ------------------------------------------------
## Method `Translator$set_default_value`
## ------------------------------------------------
tr <- Translator$new()
tr$set_default_value("<unavailable>")
Read and Write Translations
Description
Export Translator objects to text files and import such
files back into R as Translator objects.
Usage
translator_read(
path = getOption("transltr.path"),
encoding = "UTF-8",
verbose = getOption("transltr.verbose", TRUE),
translations = TRUE
)
translator_write(
tr = translator(),
path = getOption("transltr.path"),
overwrite = FALSE,
verbose = getOption("transltr.verbose", TRUE),
translations = TRUE
)
translations_read(path = "", encoding = "UTF-8", tr = NULL)
translations_write(tr = translator(), path = "", lang = "")
translations_paths(
tr = translator(),
parent_dir = dirname(getOption("transltr.path"))
)
Arguments
path |
A non-empty and non-NA character string. A path to a file to read from, or write to.
See Details for more information. |
encoding |
A non-empty and non-NA character string. The source character encoding. In almost all cases, this should be UTF-8. Other encodings are internally re-encoded to UTF-8 for portability. |
verbose |
A non-NA logical value. Should progress information be reported? |
translations |
A non-NA logical value. Should translations files also
be read, or written along with |
tr |
A Translator object. |
overwrite |
A non-NA logical value. Should existing files be
overwritten? If such files are detected and |
lang |
A non-empty and non-NA character string. The underlying language. A language is usually a code (of two or three letters) for a native language name. While users retain full control over codes, it is best to use language codes stemming from well-known schemes such as IETF BCP 47, or ISO 639-1 to maximize portability and cross-compatibility. |
parent_dir |
A non-empty and non-NA character string. A path to a parent directory. |
Details
The information contained within a Translator object is
split: translations are reorganized by language and exported independently
from other fields.
translator_write() creates two types of file: a single Translator file,
and zero, or more translations files. These are plain text files that can
be inspected and modified using a wide variety of tools and systems. They
target different audiences:
the Translator file is useful to developers, and
translations files are meant to be shared with non-technical collaborators such as translators.
translator_read() first reads a Translator file and creates a
Translator object from it. It then calls
translations_paths() to list expected translations files (that should
normally be stored alongside the Translator file), attempts to read them,
and registers successfully imported translations.
There are two requirements.
All files must be stored in the same directory. By default, this is set equal to inst/transltr/ (see getOption("transltr.path")).
Filenames of translations files are standardized and must correspond to languages (language codes, see lang).
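As a rough illustration of these two requirements, the sketch below builds the paths at which translations files would be expected for a set of language codes. The helper expected_translations_paths() is hypothetical and is not part of transltr. The ".txt" extension is an assumption inferred from the Examples section, where the file for language "el" is read from "el.txt". In practice, translations_paths() should be preferred.

```r
# Hypothetical helper (not part of transltr): derive one file path per
# language code, all located in the same parent directory.
expected_translations_paths <- function(langs, parent_dir) {
    # Assumption: files are named after language codes and end in ".txt".
    paths <- file.path(parent_dir, paste0(langs, ".txt"))

    # Name each path after its language code for convenient lookups.
    names(paths) <- langs
    paths
}

paths <- expected_translations_paths(
    langs      = c("es", "fr"),
    parent_dir = file.path("inst", "transltr"))

# Check which of the expected files are actually present on disk.
file.exists(paths)
```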
The inner workings of the serialization process are thoroughly described in
serialize().
Translator file
A Translator file contains a YAML (1.1)
representation of a Translator object stripped of all
its translations except those that are registered as source text.
Translations files
A translations file contains a FLAT representation of a set of translations sharing the same target language. This format attempts to be as simple as possible for non-technical collaborators.
Value
translator_read() returns an R6 object of class
Translator.
translator_write() returns NULL, invisibly. It is used for its
side-effects of
creating a Translator file at the location given by path, and
creating further translations file(s) in the same directory if translations is TRUE.
translations_read() returns an S3 object of class
ExportedTranslations.
translations_write() returns NULL, invisibly.
translations_paths() returns a named character vector.
See Also
Examples
# Set source language.
language_source_set("en")
# Create a path to a temporary Translator file.
temp_path <- tempfile(pattern = "translator_", fileext = ".yml")
temp_dir <- dirname(temp_path) ## tempdir() could also be used
# Create a Translator object.
# This would normally be done by find_source(), or translator_read().
tr <- translator(
    id = "test-translator",
    en = "English",
    es = "Español",
    fr = "Français",
    text(
        en = "Hello, world!",
        fr = "Bonjour, monde!"),
    text(
        en = "Farewell, world!",
        fr = "Au revoir, monde!"))
# Export it. This creates 3 files: 1 Translator file, and 2 translations
# files because two non-source languages are registered. The file for
# language "es" contains placeholders and must be completed.
translator_write(tr, temp_path)
translator_read(temp_path)
# Translations can be read individually.
translations_files <- translations_paths(tr, temp_dir)
translations_read(translations_files[["es"]])
translations_read(translations_files[["fr"]])
# This is rarely useful, but translations can also be exported individually.
# You may use this to add a new language, as long as it has an entry in the
# underlying Translator object (or file).
tr$set_native_languages(el = "Greek")
translations_files <- translations_paths(tr, temp_dir)
translations_write(tr, translations_files[["el"]], "el")
translations_read(file.path(temp_dir, "el.txt"))
Universally Unique Identifiers
Description
Generate a random UUID (Universally Unique Identifier) that complies with RFC 4122. Such a value is also known as a version 4 UUID.
Usage
uuid()
uuid_raw()
uuid_is(x)
Arguments
x |
An R object. |
Details
uuid() calls uuid_raw() and formats its output accordingly.
Pseudo-random bytes are generated with sample() whenever uuid_raw()
is called. This is most likely done before runtime when
Translator objects are created. uuid_raw() samples values
in the [0, 255] range with replacement and converts them to raw
values. The user must ensure that the underlying seed is appropriate when
generating UUIDs. See set.seed() for more information.
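The process described above can be sketched in base R as follows. This is a minimal illustration and not the package's actual implementation: uuid_raw_sketch() and uuid_sketch() are hypothetical names, and the way version and variant bits are stamped follows the usual layout of version 4 UUIDs rather than anything stated in this page.

```r
# Hypothetical helper: draw 16 pseudo-random bytes and stamp the version
# and variant bits expected of a version 4 (variant 1) UUID.
uuid_raw_sketch <- function() {
    # Draw 16 integers in the [0, 255] range with replacement.
    bytes <- sample.int(256L, 16L, replace = TRUE) - 1L

    # Byte 7: keep the low nibble (mask 15) and set the high nibble to 4.
    bytes[[7L]] <- bitwOr(bitwAnd(bytes[[7L]], 15L), 64L)

    # Byte 9: keep the six low bits (mask 63) and set the two high bits to 10.
    bytes[[9L]] <- bitwOr(bitwAnd(bytes[[9L]], 63L), 128L)

    as.raw(bytes)
}

# Hypothetical helper: format 16 raw bytes as an 8-4-4-4-12 string.
uuid_sketch <- function(bytes = uuid_raw_sketch()) {
    # as.character() converts each raw byte to two hexadecimal characters.
    hex <- paste(as.character(bytes), collapse = "")
    paste(
        substr(hex, 1L, 8L),
        substr(hex, 9L, 12L),
        substr(hex, 13L, 16L),
        substr(hex, 17L, 20L),
        substr(hex, 21L, 32L),
        sep = "-")
}

# Because sample() depends on the seed, equal seeds yield equal UUIDs.
set.seed(1L); id_first  <- uuid_sketch()
set.seed(1L); id_second <- uuid_sketch()
```

Since both calls above start from the same seed, they produce the same identifier. This is why the seed must be appropriate whenever distinct identifiers are expected.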
Value
uuid() returns a character string containing exactly 36
characters: 32 hexadecimal characters and 4 hyphens (used as separators).
uuid_raw() returns a raw vector of length 16.
uuid_is() returns a logical vector having the same length as x. It
checks whether its elements are valid version 4 (variant 1) UUIDs or
not. It returns FALSE for any other kind of UUID.
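A check of this kind could be approximated with a regular expression, as sketched below. The function uuid_is_sketch() is hypothetical and may differ from uuid_is() in its details. In particular, accepting uppercase hexadecimal characters is an assumption.

```r
# Hypothetical validator: TRUE for strings shaped like version 4
# (variant 1) UUIDs, and FALSE for anything else.
uuid_is_sketch <- function(x) {
    pattern <- paste0(
        "^[0-9a-f]{8}-[0-9a-f]{4}-",
        "4[0-9a-f]{3}-",        # third group starts with the version (4)
        "[89ab][0-9a-f]{3}-",   # fourth group starts with the variant (10xx)
        "[0-9a-f]{12}$")

    # Non-character inputs are never treated as valid UUIDs.
    is.character(x) & grepl(pattern, x, ignore.case = TRUE)
}

uuid_is_sketch(c(
    "123e4567-e89b-42d3-a456-426614174000",   # version 4, variant 1
    "123e4567-e89b-12d3-a456-426614174000"))  # version 1, rejected
```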
Note
UUIDs are designed to be globally unique (collisions are extremely unlikely) and are sometimes called GUIDs (Globally Unique Identifiers). There are several UUID versions with slightly different purposes.
Package transltr uses random identifiers (version 4/variant 1,
also known as DCE 1.1, ISO/IEC 11578:1996).
See Also
Examples
uuid()
uuid_raw()
uuid_is(uuid()) ## TRUE
uuid_is(uuid_raw()) ## FALSE, uuid_raw() does not return a string.
Apply Wrappers
Description
These functions wrap a function of the apply()
family, and enforce various values for convenience. Arguments are
passed as is to an apply() function.
Usage
vapply_1l(x, fun, ...)
vapply_1i(x, fun, ...)
vapply_1c(x, fun, ...)
map(fun, ..., more = list())
Arguments
x |
See argument |
fun |
|
... |
Further optional arguments passed to |
more |
See argument |
Value
vapply_1l(),
vapply_1i(), and
vapply_1c() respectively return a logical, an integer, and a character
vector having the same length as x. Names are always discarded.
map() returns a list having the same length as the longest element passed
to ....
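For illustration only, wrappers with these return values could be written in base R as shown below. The _sketch suffix marks these definitions as hypothetical. Building them on vapply() and mapply() is an assumption, and the package's internal definitions may differ.

```r
# Hypothetical equivalents: each wrapper fixes FUN.VALUE to a single value
# of the expected type, and drops names from the output.
vapply_1l_sketch <- function(x, fun, ...) {
    vapply(X = x, FUN = fun, FUN.VALUE = NA, ..., USE.NAMES = FALSE)
}

vapply_1i_sketch <- function(x, fun, ...) {
    vapply(X = x, FUN = fun, FUN.VALUE = NA_integer_, ..., USE.NAMES = FALSE)
}

vapply_1c_sketch <- function(x, fun, ...) {
    vapply(X = x, FUN = fun, FUN.VALUE = NA_character_, ..., USE.NAMES = FALSE)
}

# Hypothetical equivalent of map(): iterate over the elements of ... in
# parallel, pass more as constant arguments, and always return a list.
map_sketch <- function(fun, ..., more = list()) {
    mapply(FUN = fun, ..., MoreArgs = more, SIMPLIFY = FALSE, USE.NAMES = FALSE)
}

vapply_1l_sketch(1:3, function(v) v > 1L)
vapply_1i_sketch(c("a", "bb"), nchar)
vapply_1c_sketch(1:2, function(v) letters[v])
map_sketch(
    function(a, b, sep) paste(a, b, sep = sep),
    c("x", "y"),
    c("1", "2"),
    more = list(sep = "-"))
```

Fixing the expected type up front means that a function returning a value of another type, or of a length other than 1, triggers an error instead of silently producing an unexpected structure.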
See Also
Other utility functions:
format_vector(),
stops(),
str_to()