Type: Bug
Resolution: Unresolved
Priority: Blocker
Component/s: workflow-basic-steps-plugin
Labels: None
Environment: Jenkins 2.121.2 and Jenkins 2.81; Pipeline Groovy Plugin 2.54

I'm extracting the XML manifest (.nuspec) from some NuGet packages and trying to parse it. In most cases this works fine, but some of the files were written as UTF-8 with a BOM, and for those the parser fails with:

org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.

This is how I parse the XML:

import groovy.xml.XmlUtil

@NonCPS
def parsePackage(packageName, packageVersion) {
    def packageFullName = "${packageName}.${packageVersion}"
    // Download the package and unzip it
    bat """curl -L https://www.nuget.org/api/v2/package/${packageName}/${packageVersion} -o ${packageFullName}.nupkg"""
    bat """unzip ${packageFullName}.nupkg -d ${packageFullName}"""

    def nuspecPath = """${packageFullName}\\${packageName}.nuspec"""
    def nuspecContent = readFile file: nuspecPath
    // parseText throws the SAXParseException when the nuspec starts with a BOM
    def nuspecXML = new XmlSlurper(false, false).parseText(nuspecContent)
    println nuspecXML.metadata.version

    def newXml = XmlUtil.serialize(nuspecXML)
    return newXml
}

It looks like readFile does not handle UTF-8 with BOM: it passes the leading BOM character through into the returned string.
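As a possible workaround until this is fixed, the BOM could be stripped from the string before parsing. This is only a minimal sketch: it assumes the problem is a single leading U+FEFF character (the decoded form of the UTF-8 BOM bytes EF BB BF), and `stripBom` is a hypothetical helper name, not part of any plugin. The BOM-prefixed string below stands in for what readFile returns. The unqualified `XmlSlurper` assumes Groovy 2.x as used by Jenkins; on Groovy 3 or later it lives in `groovy.xml`.

```groovy
// Hypothetical helper: drop one leading U+FEFF (decoded BOM) if present.
String stripBom(String text) {
    text?.startsWith('\uFEFF') ? text.substring(1) : text
}

// Simulated readFile result: a BOM followed by a tiny nuspec-like document.
def withBom = '\uFEFF<?xml version="1.0" encoding="utf-8"?>' +
        '<package><metadata><version>1.2.3</version></metadata></package>'

// Parsing withBom directly fails with "Content is not allowed in prolog";
// parsing the stripped string works.
def pkg = new XmlSlurper(false, false).parseText(stripBom(withBom))
println pkg.metadata.version.text()   // prints 1.2.3
```

In the pipeline above this would mean calling `parseText(stripBom(nuspecContent))` instead of `parseText(nuspecContent)`.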

I tried to reproduce it in plain Groovy with:

def xmldata = new File("Newtonsoft.Json.nuspec").text
def pkg = new XmlSlurper().parseText(xmldata) 
println pkg.metadata.version.text()

But here the leading BOM character is not passed into the xmldata variable, so parsing succeeds.
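The difference can be illustrated in plain Groovy, outside Jenkins. The sketch below writes a temporary file that starts with a UTF-8 BOM and reads it back two ways. A plain UTF-8 decode of the bytes keeps U+FEFF as the first character. As far as I can tell, Groovy's `File.text` detects and skips the BOM, which would match the behaviour described above. I have not checked how readFile decodes the file internally, so the plain decode is only an assumed stand-in for it. The temp file name is arbitrary.

```groovy
// Create a temp file whose bytes start with the UTF-8 BOM (EF BB BF).
def f = File.createTempFile('bom-demo', '.nuspec')
f.deleteOnExit()
f.setBytes('\uFEFF<package/>'.getBytes('UTF-8'))

// Plain decode: the BOM survives as a leading U+FEFF character.
def raw = new String(f.bytes, 'UTF-8')
println "plain decode starts with BOM: ${raw.startsWith('\uFEFF')}"

// Groovy's File.text: check whether the BOM is still there.
def viaText = f.text
println "File.text starts with BOM:   ${viaText.startsWith('\uFEFF')}"
```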

Attached is an example nuspec with a BOM in it.

Newtonsoft.Json.nuspec
2 kB
2018-10-04 10:43

Assignee: Unassigned

Reporter: Jakub Pawlinski

Votes: 1

Watchers: 5

Created: 2018-10-04 10:30

Updated: 2019-09-09 10:53
