pypyr.steps.filereplace

find & replace arbitrary strings in a file

Parses input text file and replaces any given search strings.

The other fileformat steps, by contrast, use string formatting expressions inside {braces} to format values against the pypyr context.

This step, however, lets you find any arbitrary search string and replace it with any replacement string. This is especially handy if you are working with a file where curly braces aren’t helpful for a formatting expression - e.g. inside a .js or a terraform .tf file. filereplace is more useful than fileformat when you do not control the input file format and you have to replace strings as they are in the given file.
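To see why braces get in the way, here is a minimal sketch in plain Python. It is an illustration only and not pypyr's implementation: Python's built-in str.format stands in for a brace-based formatter, and the js_source string and NAME placeholder are made up for the example.

```python
# Illustration only: a brace-based formatter vs. a plain find & replace.
# js_source and the NAME placeholder are hypothetical.
js_source = "function greet() { return 'hello NAME'; }"

# str.format reads "{ return 'hello NAME'; }" as a formatting expression,
# so formatting the file content as-is fails.
try:
    js_source.format(NAME="world")
    format_failed = False
except (KeyError, ValueError, IndexError):
    format_failed = True

# A plain find & replace treats the braces as ordinary text.
replaced = js_source.replace("NAME", "world")

print(format_failed)
print(replaced)
```

The braces that belong to the JavaScript function body survive the plain replace untouched, which is the situation filereplace is meant for.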

Example input context:

fileReplace:
  in: ./infile.txt
  out: ./outfile.txt
  replacePairs:
    findmestring: replacewithme
    findanotherstring: replacewithanotherstring
    alaststring: alastreplacement

See a worked example of filereplace.

multiple files & globs

fileReplace expects the following context keys:

  • fileReplace
    • in
      • Mandatory path(s) to source file on disk.
      • This can be a string path to a single file, or a glob, or a list of paths and/or globs.
      • Each path can be a glob, a relative or absolute path.
    • out (optional)
      • Write output file to here. Will create directories in path if these do not exist already.
      • out is optional. If not specified, will edit the in files in-place.
      • If in-path refers to >1 file (e.g. it’s a glob or list), out path can only be a directory - it doesn’t make sense to write >1 file to the same single file output (this is not an appender).
      • To ensure out path means a directory and not a file, be sure to have the OS’s path separator at the end (/ on a sensible filesystem).
      • If you specify an out directory without a file-name, out files will have the same name they had in in.
    • replacePairs
      • dictionary where format is:
        • 'find string': 'replace string'

See file format settings for more examples on in/out path handling - the same processing rules apply.
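The out rules above can be sketched roughly as follows. This is a simplified reading of the documented behavior, not pypyr's own code: resolve_out_path is a hypothetical helper, and it only covers the three cases named in the list (no out, out as a directory with a trailing separator, out as a file).

```python
import os
from pathlib import Path
from typing import Optional


def resolve_out_path(in_path: str, out: Optional[str]) -> Path:
    """Hypothetical helper: work out where the output for in_path goes."""
    if out is None:
        # no out specified: edit the in file in-place
        return Path(in_path)
    if out.endswith(("/", os.sep)):
        # trailing separator: out is a directory, keep the in file's name
        return Path(out) / Path(in_path).name
    # otherwise out is the output file itself
    return Path(out)


print(resolve_out_path("./infile.txt", None))
print(resolve_out_path("./testfiles/replace/a.ext", "./out/replace/"))
print(resolve_out_path("./infile.txt", "./outfile.txt"))
```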

Here is an example with globs in a list. You can also pass a single glob as in directly as a string, not in a list.

fileReplace:
  in:
    # ** recurses sub-dirs per usual globbing
    - ./testfiles/replace/sub/**
    - ./testfiles/replace/*.ext
  # note the dir separator at the end.
  # since >1 in files, out can only be a dir.
  out: ./out/replace/
  replacePairs:
      findmestring: replacewithme
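If you want a feel for what a list of globs like this selects, the following sketch uses Python's standard glob module on a throwaway directory. It assumes that standard recursive glob semantics apply; the file names are invented for the example and pypyr's own path handling may differ in detail.

```python
import glob
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # hypothetical layout, loosely mirroring the example above
    (root / "sub" / "deep").mkdir(parents=True)
    for name in ("a.ext", "b.txt", "sub/c.txt", "sub/deep/d.txt"):
        (root / name).write_text("findmestring")

    patterns = [str(root / "sub" / "**"), str(root / "*.ext")]

    matched = set()
    for pattern in patterns:
        # recursive=True makes ** descend into sub-directories
        for match in glob.glob(pattern, recursive=True):
            if Path(match).is_file():
                matched.add(Path(match).relative_to(root).as_posix())

    matched = sorted(matched)

print(matched)
```

Here b.txt is left alone because no pattern matches it, while everything under sub is picked up at any depth.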

If you do not specify out, it will over-write (i.e. edit in-place) all the files specified by in.

fileReplace:
  # in-place edit/overwrite all the files in
  in: ./infile.txt
  replacePairs:
    findmestring: replacewithme

The file in and out paths support substitutions.

replace with substitution expressions

fileReplace also does substitutions from context on the replacePairs for both the search string and the replacement string. pypyr does this before it searches and replaces in the in files. This allows you to specify dynamic search strings and dynamic replacement values.

Given an input file like this:

“the beginning thy REPEATING-WORD, thy happy REPEATING-WORD,
   Sing thy songs of happy cheer!”
So I LOOK FOR ME the same again,
   <<manywords>> to arbPyExpression.

And a pipeline like this:

steps:
  - name: pypyr.steps.filereplace
    comment: replace multiple strings in a file.
    in:
      k1: LOOK FOR ME
      k2: wept with
      fileReplace:
        in: ./filereplace.in
        out: ./filereplace.out
        replacePairs:
          # replace many words with one word
          the beginning: Drop
          # replace the same term multiple times in same file
          REPEATING-WORD: pipe
          # dynamically set search term using formatting expression
          '{k1}': sang
          # dynamically set replacement term 
          # sometimes angled brackets make search terms easier for humans to read.
          <<manywords>>: While he {k2} joy
          # you can also use py strings - here we reverse a string with python.
          arbPyExpression: !py '"raeh"[::-1]'

The resulting output file is:

“Drop thy pipe, thy happy pipe,
   Sing thy songs of happy cheer!”
So I sang the same again,
   While he wept with joy to hear.
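The two phases (format the pairs against context first, then search and replace) can be approximated in plain Python. This is a sketch under assumptions, not pypyr's implementation: str.format stands in for pypyr formatting expressions, format_pairs and replace_all are hypothetical helpers, and the !py expression is simply evaluated inline as ordinary Python.

```python
def format_pairs(pairs, context):
    """Phase 1: substitute context values into search and replacement strings."""
    return {
        find.format(**context): replacement.format(**context)
        for find, replacement in pairs.items()
    }


def replace_all(text, pairs):
    """Phase 2: apply each find/replace pair to the text in turn."""
    for find, replacement in pairs.items():
        text = text.replace(find, replacement)
    return text


context = {"k1": "LOOK FOR ME", "k2": "wept with"}
pairs = {
    "the beginning": "Drop",
    "REPEATING-WORD": "pipe",
    "{k1}": "sang",
    "<<manywords>>": "While he {k2} joy",
    # stands in for the !py expression in the pipeline
    "arbPyExpression": "raeh"[::-1],
}

line = "So I LOOK FOR ME the same again, <<manywords>> to arbPyExpression."
result = replace_all(line, format_pairs(pairs, context))
print(result)
```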

replacement order

Be careful of order. A later replacement could well replace text that an earlier replacement in the sequence just inserted.

- name: pypyr.steps.filereplace
  in:
    fileReplace:
      in: ./in.txt
      out: ./out.txt
      replacePairs:
        replaceme: INTERMEDIATE
        # later replaces can affect earlier replaces
        INTERMEDIATE: final

Given the input file:

some text replaceme end.

The output is:

some text final end.

Although this example is a bit contrived, it is something to watch out for when you are using dynamic string expressions where it might not be immediately obvious that a later search string is finding an earlier replacement.

If replacePairs is not an ordered collection, replacements could evaluate in any given order. This is only relevant if you are a coder building replacePairs in code rather than in pipeline yaml.

If you are creating your in parameters in the pipeline yaml, don’t worry about it, it will be an ordered dictionary already, so life is good.
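For coders, the effect of ordering is easy to reproduce with a plain Python dict, which keeps insertion order. The sketch below assumes each pair is applied to the whole text in sequence, as the example above implies; apply_in_order is a hypothetical helper and not part of pypyr.

```python
def apply_in_order(text, replace_pairs):
    """Apply each find/replace pair in the dict's insertion order."""
    for find, replacement in replace_pairs.items():
        text = text.replace(find, replacement)
    return text


source = "some text replaceme end."

# same order as the pipeline above: the second pair rewrites the first pair's output
chained = apply_in_order(
    source, {"replaceme": "INTERMEDIATE", "INTERMEDIATE": "final"}
)

# swapped order: INTERMEDIATE is searched for before it exists in the text
swapped = apply_in_order(
    source, {"INTERMEDIATE": "final", "replaceme": "INTERMEDIATE"}
)

print(chained)
print(swapped)
```

The same two pairs give different results depending only on their order, which is why an unordered collection would make the output unpredictable.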

substitutions on paths

The file in and out paths support substitutions, which allows you to specify paths dynamically.

- name: pypyr.steps.set
  comment: set some arb values in context
  in:
    set:
      myfilename: input-file
      myoutputfile: out/output.txt

- name: pypyr.steps.filereplace
  comment: you can set in & out entirely or partially with formatting expressions
  in:
    fileReplace:
        in: testfiles/{myfilename}.ext
        out: '{myoutputfile}'
        replacePairs:
          arb: value

special characters

If your search or replacement string has {curly braces}, pypyr will treat what is inside the curly brace as a normal pypyr formatting substitution expression.

If you want to search for and replace strings with literal {curlies}, you have to escape the literals by {{doubling}} the braces.

Given an input file like this:

Piping down the {arb braces} wild
Piping manywords

And a pipeline like this:

steps:
  - name: pypyr.steps.filereplace
    comment: escape literal curly braces by doubling
    in:
      arb_key: of pleasant
      fileReplace:
        in:
          - ./in/*.txt
          - ./in/sub/*
        out: ./out/replace/
        replacePairs:
            # notice escaping literal curlies by doubling
            '{{arb braces}}': valleys
            # {arb_key} will replace from context
            manywords: songs {arb_key} glee

The output is:

Piping down the valleys wild
Piping songs of pleasant glee
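Doubling braces to escape them works the same way in Python's own str.format, which makes for a quick way to check what a search string will look like after formatting. This sketch uses str.format as a stand-in for pypyr's formatting, on the assumption that the escaping rule behaves alike.

```python
context = {"arb_key": "of pleasant"}

# doubled braces collapse to single literal braces after formatting
find = "{{arb braces}}".format(**context)
# single braces are substituted from context
replacement = "songs {arb_key} glee".format(**context)

text = "Piping down the {arb braces} wild\nPiping manywords"
text = text.replace(find, "valleys").replace("manywords", replacement)

print(find)
print(text)
```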

encoding

By default in will read and out will write in the platform’s default encoding. This is utf-8 for most systems, but be aware on Windows it’s still cp1252.

You can use the encoding input explicitly to set the encoding:

- name: pypyr.steps.filereplace
  comment: set encoding
  in:
    fileReplace:
      in: testfiles/infile.txt
      out: testfiles/outfile.txt
      encoding: utf-8
      replacePairs:
        lookforme: replacewithme

You can also individually set the encoding for in and out. This allows you to convert a file from one encoding to another:

- name: pypyr.steps.filereplace
  comment: set encoding
  in:
    fileReplace:
      in: testfiles/infile.txt
      out: testfiles/outfile.txt
      encodingIn: ascii
      encodingOut: utf-16
      replacePairs:
        lookforme: replacewithme

All of these are optional - if you do not explicitly over-ride the encoding for either in or out, pypyr will just use the system default.
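Conceptually, setting a different encoding for in and out amounts to decoding with one codec and encoding with another. Here is a rough standalone Python equivalent, offered as a sketch only: replace_and_reencode is a hypothetical helper, and the temporary files exist just for the demonstration.

```python
import tempfile
from pathlib import Path


def replace_and_reencode(in_path, out_path, pairs, encoding_in, encoding_out):
    """Read with one encoding, replace strings, write with another encoding."""
    text = Path(in_path).read_text(encoding=encoding_in)
    for find, replacement in pairs.items():
        text = text.replace(find, replacement)
    Path(out_path).write_text(text, encoding=encoding_out)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "infile.txt"
    dst = Path(tmp) / "outfile.txt"
    src.write_text("please lookforme here", encoding="ascii")

    replace_and_reencode(
        src, dst, {"lookforme": "replacewithme"}, "ascii", "utf-16"
    )

    raw = dst.read_bytes()
    roundtrip = dst.read_text(encoding="utf-16")

print(roundtrip)
```

The output file starts with a utf-16 byte order mark, so reading it back with the wrong encoding will not give you the original text.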

See here for more details on handling text encoding in pypyr and changing the defaults.

See here for a list of available encoding codecs.
