PyYAML - Using different styles for keys and for integers and strings

Updated April 2019

--- 
"main": 
  "directory": 
    "options": 
      "directive": 'options'
      "item": 
        "options": 'Stuff OtherStuff MoreStuff'
  "directoryindex": 
    "item": 
      "directoryindex": 'stuff.htm otherstuff.htm morestuff.html'
  "fileetag": 
    "item": 
      "fileetag": 'Stuff'
  "keepalive": 
    "item": 
      "keepalive": 'Stuff'
  "keepalivetimeout": 
    "item": 
      "keepalivetimeout": 2

Above is the YAML file that I need to parse, edit, and then dump. I decided to do this with PyYAML on Python 2.7 (I am required to use it). I have been able to parse and edit the file.

However, because the YAML file uses one style for keys and different styles for strings and numbers, I cannot set a single default style. I am now wondering how I can get PyYAML to dump different styles for different types.

Below is what I do to parse and edit:

import yaml

infile = yaml.load(open('yamlfile'))

#Recursive function to loop through nested dictionary
def edit(d,keytoedit=None,newvalue=None):
  for key, value in d.iteritems():
    if isinstance(value, dict) and key == keytoedit and 'item' in value:
      value[value.iterkeys().next()] = {keytoedit:newvalue}
      edit(value,keytoedit=keytoedit,newvalue=newvalue)
    elif isinstance(value, dict) and keytoedit in value and 'item' not in value and key != 'main':
      value[keytoedit] = newvalue
      edit(value,keytoedit=keytoedit,newvalue=newvalue)
    elif isinstance(value, dict):
      edit(value,keytoedit=keytoedit,newvalue=newvalue)

outfile = file('outfile','w')
yaml.dump(infile, outfile,default_flow_style=False)

So I am wondering how to achieve this: if I use default_style in yaml.dump, all types get the same style, but I need to stick to the original format of the YAML file.

Can I somehow specify styles for specific types with PyYAML?

Edit: Here is what I have so far; the missing piece is double quotes on the keys and single quotes on the strings.

main:
  directory:
    options:
      directive: options
      item:
        options: Stuff OtherStuff MoreStuff
  directoryindex:
    item:
      directoryindex: stuff.html otherstuff.htm morestuff.html
  fileetag:
    item:
      fileetag: Stuff
  keepalive:
    item:
      keepalive: 'On'
  keepalivetimeout:
    item:
      keepalivetimeout: 2

1 answer

You can at least preserve the original flow/block style for the various elements with the normal yaml.dump() for some value of "normal".

What you need is a loader that saves the flow/block style information while reading the data. It should subclass the normal types that carry a style (mappings/dicts and sequences/lists) so that they behave like the Python constructs normally returned by the loader but have the style information attached. Then, on the way out, you give yaml.dump a custom dumper that takes this style information into account.
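With stock PyYAML, the "style attached to the type, plus a custom dumper" half of that idea can be sketched roughly as follows. This is not a round-trip loader: it does not record styles while reading, it simply assumes the rule from the question (keys double-quoted, string values single-quoted, integers plain). The names KeyStr, ValueStr, StyledDumper and mark are hypothetical, and the sketch is written for Python 3; on Python 2.7 the str check would probably need to be basestring.

```python
import yaml


class KeyStr(str):
    """Hypothetical marker type: a mapping key, to be dumped double-quoted."""


class ValueStr(str):
    """Hypothetical marker type: a string value, to be dumped single-quoted."""


def represent_key(dumper, data):
    # style='"' asks the emitter for a double-quoted scalar
    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data), style='"')


def represent_value(dumper, data):
    # style="'" asks the emitter for a single-quoted scalar
    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data), style="'")


class StyledDumper(yaml.SafeDumper):
    """Separate dumper class so the representers do not affect plain yaml.dump."""


StyledDumper.add_representer(KeyStr, represent_key)
StyledDumper.add_representer(ValueStr, represent_value)


def mark(node):
    """Recursively wrap keys and string values in the marker types.

    Assumes all mapping keys are strings, as in the file from the question.
    """
    if isinstance(node, dict):
        return {KeyStr(k): mark(v) for k, v in node.items()}
    if isinstance(node, str):
        return ValueStr(node)
    return node  # integers and other types keep the default (plain) style


data = {'main': {'keepalive': {'item': {'keepalive': 'Stuff'}},
                 'keepalivetimeout': {'item': {'keepalivetimeout': 2}}}}
styled = yaml.dump(mark(data), Dumper=StyledDumper,
                   default_flow_style=False, explicit_start=True)
print(styled)
```

The wrapping step is the price of staying with plain PyYAML: the data has to be marked before every dump, whereas a round-trip loader would attach the information once at load time.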

I use the normal yaml.dump in my enhanced version of PyYAML, called ruamel.yaml, but with a special dumper class, RoundTripDumper (and a RoundTripLoader for yaml.load), that preserves the flow/block style (and any comments you might have in the file):

import ruamel.yaml as yaml

infile = yaml.load(open('yamlfile'), Loader=yaml.RoundTripLoader)

for key, value in infile['main'].items():
    if key == 'keepalivetimeout':
        item = value['item']
        item['keepalivetimeout'] = 400

print yaml.dump(infile, Dumper=yaml.RoundTripDumper)

gives you:

main:
  directory:
    options:
      directive: options
      item:
        options: Stuff OtherStuff MoreStuff
  directoryindex:
    item:
      directoryindex: stuff.htm otherstuff.htm morestuff.html
  fileetag:
    item:
      fileetag: Stuff
  keepalive:
    item:
      keepalive: Stuff
  keepalivetimeout:
    item:
      keepalivetimeout: 400

If you cannot install ruamel.yaml, you can pull the code out of my repository and include it in your own. AFAIK PyYAML has not been upgraded since I started working on this.

I currently don't preserve the superfluous quotes on the scalars, but I do preserve the chomping information (for multiline scalars starting with '|'). The quoting information is thrown out really early in the input processing of the YAML file, and preserving it would require multiple changes.

Since you seem to have different quotes for key and value string scalars, you can achieve the output you want by overriding process_scalar (part of the Emitter in emitter.py) to add the quotes based on whether the string scalar is a key and whether it is an integer:

import ruamel.yaml as yaml

# the scalar emitter from emitter.py
def process_scalar(self):
    if self.analysis is None:
        self.analysis = self.analyze_scalar(self.event.value)
    if self.style is None:
        self.style = self.choose_scalar_style()
    split = (not self.simple_key_context)
    # VVVVVVVVVVVVVVVVVVVV added
    try:
        int(self.event.value)  # might need to expand this
    except ValueError:
        # we have string
        if split:
            self.style = "'"
        else:
            self.style = '"'
    # ^^^^^^^^^^^^^^^^^^^^
    # if self.analysis.multiline and split    \
    #         and (not self.style or self.style in '\'\"'):
    #     self.write_indent()
    if self.style == '"':
        self.write_double_quoted(self.analysis.scalar, split)
    elif self.style == '\'':
        self.write_single_quoted(self.analysis.scalar, split)
    elif self.style == '>':
        self.write_folded(self.analysis.scalar)
    elif self.style == '|':
        self.write_literal(self.analysis.scalar)
    else:
        self.write_plain(self.analysis.scalar, split)
    self.analysis = None
    self.style = None
    if self.event.comment:
        self.write_post_comment(self.event)


infile = yaml.load(open('yamlfile'), Loader=yaml.RoundTripLoader)

for key, value in infile['main'].items():
    if key == 'keepalivetimeout':
        item = value['item']
        item['keepalivetimeout'] = 400

dd = yaml.RoundTripDumper
dd.process_scalar = process_scalar

print '---'
print yaml.dump(infile, Dumper=dd)

gives you:

---
"main":
  "directory":
    "options":
      "directive": 'options'
      "item":
        "options": 'Stuff OtherStuff MoreStuff'
  "directoryindex":
    "item":
      "directoryindex": 'stuff.htm otherstuff.htm morestuff.html'
  "fileetag":
    "item":
      "fileetag": 'Stuff'
  "keepalive":
    "item":
      "keepalive": 'Stuff'
  "keepalivetimeout":
    "item":
      "keepalivetimeout": 400

which is quite close to what you asked for.
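The int() test in process_scalar is marked "might need to expand this". One possible expansion, as a sketch only, is to isolate the decision in a small helper so the rule can be tested and widened without touching the emitter. The helper name pick_quote_style and the regular expression are hypothetical, and the pattern is deliberately narrow (optional sign, digits, optional fractional part); it needs nothing beyond the standard library.

```python
import re

# Assumed, intentionally narrow notion of "number": optional sign, digits,
# optional fractional part. Widen this pattern if your data needs more.
_NUMBER = re.compile(r'[-+]?\d+(\.\d+)?\Z')


def pick_quote_style(text, is_key):
    """Choose a scalar style the way the patched process_scalar does.

    Returns None for numeric-looking scalars (leave the style untouched),
    '"' for string keys and "'" for string values.
    """
    if _NUMBER.match(text):
        return None
    return '"' if is_key else "'"


for text, is_key in [('main', True), ('Stuff', False), ('400', False)]:
    print(text, is_key, pick_quote_style(text, is_key))
```

Inside process_scalar, the result could replace the try/except block: call the helper with self.event.value and not split, and assign the result to self.style only when it is not None.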