pypyr.steps.filereplace
find & replace arbitrary strings in a file
Parses input text file and replaces any given search strings.
The other fileformat steps, by contrast, use string formatting expressions inside {braces} to format values against the pypyr context.
This step, however, lets you find any arbitrary search string and replace it
with any replacement string. This is especially handy if you are working with a
file where curly braces aren’t helpful for a formatting expression, e.g.
inside a .js or a terraform .tf file. filereplace is more useful than
fileformat when you do not control the input file format and you have to
replace strings as they are in the given file.
Example input context:
fileReplace:
  in: ./infile.txt
  out: ./outfile.txt
  replacePairs:
    findmestring: replacewithme
    findanotherstring: replacewithanotherstring
    alaststring: alastreplacement
See a worked example of filereplace.
multiple files & globs
fileReplace expects the following context keys:
- fileReplace
  - in
    - Mandatory path(s) to source file on disk.
    - This can be a string path to a single file, or a glob, or a list of paths and/or globs.
    - Each path can be a glob, a relative or absolute path.
  - out
    - (optional) Write output file to here. Will create directories in path if these do not exist already.
    - out is optional. If not specified, will edit the in files in-place.
    - If in-path refers to >1 file (e.g. it’s a glob or list), out path can only be a directory - it doesn’t make sense to write >1 file to the same single file output (this is not an appender).
    - To ensure out path means a directory and not a file, be sure to have the os’ path separator at the end (/ on a sensible filesystem).
    - If you specify an out directory without a file-name, out files will have the same name they had in in.
  - replacePairs
    - dictionary where format is: 'find string': 'replace string'
See file format settings for more examples on in/out path handling - the same processing rules apply.
Here is an example with globs in a list. You can also pass a single glob as in
directly as a string, not in a list.
fileReplace:
  in:
    # ** recurses sub-dirs per usual globbing
    - ./testfiles/replace/sub/**
    - ./testfiles/replace/*.ext
  # note the dir separator at the end.
  # since >1 in files, out can only be a dir.
  out: ./out/replace/
  replacePairs:
    findmestring: replacewithme
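As a rough illustration of the path rules above, here is a minimal Python sketch of expanding a path, a glob, or a list of either, and of naming an output file inside an out directory. This is not pypyr's implementation: `resolve_in_paths` and `out_path_for` are hypothetical helpers, and the choice to write nested files flat into the out directory is an assumption of the sketch.

```python
import glob
from pathlib import Path


def resolve_in_paths(in_spec):
    """Expand a path, a glob, or a list of either into a sorted list of files.

    Hypothetical helper for illustration only, not part of pypyr.
    """
    patterns = [in_spec] if isinstance(in_spec, str) else list(in_spec)
    found = set()
    for pattern in patterns:
        # recursive=True lets ** match nested sub-directories
        for match in glob.glob(pattern, recursive=True):
            if Path(match).is_file():
                found.add(Path(match))
    return sorted(found)


def out_path_for(in_file, out_dir):
    """Reuse the input file's name inside the output directory.

    Assumption: nested input files are written flat into out_dir.
    """
    return Path(out_dir) / Path(in_file).name
```

With the example above, every file matched by either glob would be written under ./out/replace/ with its original file name.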
If you do not specify out, it will over-write (i.e. edit) all the files
specified by in.
fileReplace:
  # in-place edit/overwrite all the files in
  in: ./infile.txt
  replacePairs:
    findmestring: replacewithme
The file in and out paths support substitutions.
replace with substitution expressions
fileReplace also does substitutions from context on the replacePairs for both
the search string and the replacement string. pypyr does this before it
searches and replaces in the in files. This allows you to specify dynamic
search strings and dynamic replacement values.
Given an input file like this:
“the beginning thy REPEATING-WORD, thy happy REPEATING-WORD,
Sing thy songs of happy cheer!”
So I LOOK FOR ME the same again,
<<manywords>> to arbPyExpression.
And a pipeline like this:
steps:
  - name: pypyr.steps.filereplace
    comment: replace multiple strings in a file.
    in:
      k1: LOOK FOR ME
      k2: wept with
      fileReplace:
        in: ./filereplace.in
        out: ./filereplace.out
        replacePairs:
          # replace many words with one word
          the beginning: Drop
          # replace the same term multiple times in same file
          REPEATING-WORD: pipe
          # dynamically set search term using formatting expression
          '{k1}': sang
          # dynamically set replacement term
          # sometimes angled brackets make search terms easier for humans to read.
          <<manywords>>: While he {k2} joy
          # you can also use py strings - here we reverse a string with python.
          arbPyExpression: !py '"raeh"[::-1]'
The resulting output file is:
“Drop thy pipe, thy happy pipe,
Sing thy songs of happy cheer!”
So I sang the same again,
While he wept with joy to hear.
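To make the sequence concrete, here is a minimal Python sketch of the idea: format every search and replacement string against the context first, then run the replacements over the text. It uses Python's built-in `str.format` as a stand-in for pypyr's formatting expressions, which is an assumption; `apply_pairs` is a hypothetical helper, and the `!py` expression is simply evaluated up front as ordinary Python.

```python
def apply_pairs(text, replace_pairs, context):
    """Format all pairs against context, then replace them in order.

    Hypothetical helper; str.format stands in for pypyr's formatting.
    """
    formatted = {
        find.format(**context): replace.format(**context)
        for find, replace in replace_pairs.items()
    }
    for find, replace in formatted.items():
        text = text.replace(find, replace)
    return text


context = {"k1": "LOOK FOR ME", "k2": "wept with"}
pairs = {
    "{k1}": "sang",                       # dynamic search string
    "<<manywords>>": "While he {k2} joy",  # dynamic replacement value
    "arbPyExpression": "raeh"[::-1],      # stand-in for the !py expression
}
source = "So I LOOK FOR ME the same again,\n<<manywords>> to arbPyExpression."

result = apply_pairs(source, pairs, context)
print(result)
# So I sang the same again,
# While he wept with joy to hear.
```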
replacement order
Be careful of order. A later replacement expression could well replace text that an earlier replacement inserted.
- name: pypyr.steps.filereplace
  in:
    fileReplace:
      in: ./in.txt
      out: ./out.txt
      replacePairs:
        replaceme: INTERMEDIATE
        # later replaces can affect earlier replaces
        INTERMEDIATE: final
Given the input file:
some text replaceme end.
The output is:
some text final end.
Although this example is a bit contrived, it is something to watch out for when you are using dynamic string expressions where it might not be immediately obvious that a later search string is finding an earlier replacement.
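The effect is easy to reproduce with plain sequential string replacement. The Python sketch below is only a model of the behaviour described here, not pypyr's code; `replace_in_order` is a hypothetical helper. It shows that the same two pairs give different results depending on the order in which they run.

```python
def replace_in_order(text, pairs):
    """Apply (find, replace) pairs one after the other, in the given order."""
    for find, replace in pairs:
        text = text.replace(find, replace)
    return text


line = "some text replaceme end."

# Same order as the pipeline above: the second pair rewrites the first pair's output.
forward = replace_in_order(
    line, [("replaceme", "INTERMEDIATE"), ("INTERMEDIATE", "final")]
)
print(forward)   # some text final end.

# Reversed order: INTERMEDIATE is not in the text yet when its pair runs.
backward = replace_in_order(
    line, [("INTERMEDIATE", "final"), ("replaceme", "INTERMEDIATE")]
)
print(backward)  # some text INTERMEDIATE end.
```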
If replacePairs is not an ordered collection, replacements could evaluate in
any given order. This is only relevant if you are a coder building replacePairs
programmatically. If you are creating your in parameters in the pipeline yaml,
don’t worry about it: it will be an ordered dictionary already, so life is good.
substitutions on paths
The file in and out paths support substitutions, which allows you to specify paths dynamically.
- name: pypyr.steps.set
  comment: set some arb values in context
  in:
    set:
      myfilename: input-file
      myoutputfile: out/output.txt
- name: pypyr.steps.filereplace
  comment: you can set in & out entirely or partially with formatting expressions
  in:
    fileReplace:
      in: testfiles/{myfilename}.ext
      out: '{myoutputfile}'
      replacePairs:
        arb: value
special characters
If your search or replacement string has {curly braces}, pypyr will treat what is inside the curly brace as a normal pypyr formatting substitution expression.
If you want to search for and replace strings with literal {curlies}, you have to escape the literals by {{doubling}} the braces.
Given an input file like this:
Piping down the {arb braces} wild
Piping manywords
And a pipeline like this:
steps:
  - name: pypyr.steps.filereplace
    comment: escape literal curly braces by doubling
    in:
      arb_key: of pleasant
      fileReplace:
        in:
          - ./in/*.txt
          - ./in/sub/*
        out: ./out/replace/
        replacePairs:
          # notice escaping literal curlies by doubling
          '{{arb braces}}': valleys
          # {arb_key} will replace from context
          manywords: songs {arb_key} glee
The output is:
Piping down the valleys wild
Piping songs of pleasant glee
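Python's own `str.format` follows the same doubling convention, so it can serve as a rough model of the escaping rule. The sketch below assumes `str.format` as a stand-in for pypyr's formatting expressions; the error pypyr raises for an unknown key may differ from the `KeyError` shown here.

```python
context = {"arb_key": "of pleasant"}

# Doubled braces collapse to literal single braces after formatting.
find = "{{arb braces}}".format(**context)
print(find)  # {arb braces}

# Single braces are treated as a substitution expression.
replacement = "songs {arb_key} glee".format(**context)
print(replacement)  # songs of pleasant glee

# Without doubling, the braces are read as a key lookup, which fails here
# because the context has no key called "arb braces".
try:
    "{arb braces}".format(**context)
    unescaped_failed = False
except KeyError:
    unescaped_failed = True

text = "Piping down the {arb braces} wild\nPiping manywords"
text = text.replace(find, "valleys").replace("manywords", replacement)
print(text)
# Piping down the valleys wild
# Piping songs of pleasant glee
```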
encoding
By default in will read and out will write in the platform’s default
encoding. This is utf-8 for most systems, but be aware on Windows it’s still
cp1252.
You can use the encoding input explicitly to set the encoding:
- name: pypyr.steps.filereplace
  comment: set encoding
  in:
    fileReplace:
      in: testfiles/infile.txt
      out: testfiles/outfile.txt
      encoding: utf-8
      replacePairs:
        lookforme: replacewithme
You can also individually set the encoding for in and out. This allows you
to convert a file from one encoding to another:
- name: pypyr.steps.filereplace
  comment: set in and out encoding separately
  in:
    fileReplace:
      in: testfiles/infile.txt
      out: testfiles/outfile.txt
      encodingIn: ascii
      encodingOut: utf-16
      replacePairs:
        lookforme: replacewithme
All of these are optional - if you do not explicitly over-ride the encoding for
either in or out, pypyr will just use the system default.
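For readers who want to see what reading in one encoding and writing in another amounts to, here is a minimal Python sketch. It is not pypyr's implementation: `convert_and_replace` and its parameters are hypothetical names. Passing `None` as an encoding leaves the choice to Python's platform default, which mirrors the fallback described above.

```python
from pathlib import Path


def convert_and_replace(in_path, out_path, pairs, encoding_in=None, encoding_out=None):
    """Read in_path, apply the replacements, write out_path.

    Hypothetical helper for illustration. An encoding of None defers to
    the platform default.
    """
    text = Path(in_path).read_text(encoding=encoding_in)
    for find, replace in pairs.items():
        text = text.replace(find, replace)
    out = Path(out_path)
    # create the output directory if it does not exist yet
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(text, encoding=encoding_out)
```

A call such as `convert_and_replace("testfiles/infile.txt", "testfiles/outfile.txt", {"lookforme": "replacewithme"}, "ascii", "utf-16")` would correspond to the last pipeline example.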
See here for more details on handling text encoding in pypyr and changing the defaults.
See here for a list of available encoding codecs.