build(deps): bump org.jsoup:jsoup from 1.16.1 to 1.21.1
Type: Pull Request
State: Open
Association: Contributor
Comments: 1
(12 months ago)
(12 months ago)
dependencies java
bmc08gt
Bumps org.jsoup:jsoup from 1.16.1 to 1.21.1.
Release notes
Sourced from org.jsoup:jsoup's releases.
jsoup 1.21.1
jsoup 1.21.1 is out now, featuring powerful new node selection capabilities that let you target specific DOM nodes like comments and text nodes using CSS selectors, dynamic tag customization through the new TagSet callback system, and improved defense against mutation XSS attacks with simplified attribute escaping. This release also brings HTTP/2 support by default, numerous API improvements for better developer experience, and fixes for several edge-case parsing issues.
jsoup is a Java library for working with real-world HTML and XML. It provides a very convenient API for extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors.
Changes
- Removed previously deprecated methods. #2317
- Deprecated the `:matchText` pseudo-selector due to its side effects on the DOM; use the new `::textnode` selector and the `Element#selectNodes(String css, Class<T> type)` method instead. #2343
- Deprecated `Connection.Response#bufferUp()` in lieu of `Connection.Response#readFully()` which can throw a checked IOException.
- Deprecated internal methods `Validate#ensureNotNull(Object)` (replaced by typed `Validate#expectNotNull(T)`); protected HTML appenders from Attribute and Node.
- If you happen to be using any of the deprecated methods, please take the opportunity now to migrate away from them, as they will be removed in a future release.
Improvements
- Enhanced the `Selector` to support direct matching against nodes such as comments and text nodes. For example, you can now find an element that follows a specific comment: `::comment:contains(prices) + p` will select `p` elements immediately after a `<!-- prices: -->` comment. Supported types include `::node`, `::leafnode`, `::comment`, `::text`, `::data`, and `::cdata`. Node contextual selectors like `::node:contains(text)`, `:matches(regex)`, and `:blank` are also supported. Introduced `Element#selectNodes(String css)` and `Element#selectNodes(String css, Class<T> nodeType)` for direct node selection. #2324
- Added `TagSet#onNewTag(Consumer<Tag> customizer)`: register a callback that’s invoked for each new or cloned Tag when it’s inserted into the set. Enables dynamic tweaks of tag options (for example, marking all custom tags as self-closing, or everything in a given namespace as preserving whitespace). #2330
- Made `TokenQueue` and `CharacterReader` autocloseable, to ensure that they will release their buffers back to the buffer pool, for later reuse.
- Added `Selector#evaluatorOf(String css)`, as a clearer way to obtain an Evaluator from a CSS query. An alias of `QueryParser.parse(String css)`.
- Custom tags (defined via the `TagSet`) in a foreign namespace (e.g. SVG) can be configured to parse as data tags.
- Added `NodeVisitor#traverse(Node)` to simplify node traversal calls (vs. importing `NodeTraversor`).
- Updated the default user-agent string to improve compatibility. #2341
- The HTML parser now allows the specific text-data type (Data, RcData) to be customized for known tags. (Previously, that was only supported on custom tags.) #2326
- Added `Connection.Response#readFully()` as a replacement for `Connection.Response#bufferUp()` with an explicit IOException. Similarly, added `Connection.Response#readBody()` over `Connection.Response#body()`. Deprecated `Connection.Response#bufferUp()`. #2327
- When serializing HTML, the `<` and `>` characters are now escaped in attributes. This helps prevent a class of mutation XSS attacks. #2337
- Changed `Connection` to prefer using the JDK's HttpClient over HttpUrlConnection, if available, to enable HTTP/2 support by default. Users can disable via `-Djsoup.useHttpClient=false`. #2340
Bug Fixes
- The contents of a `script` in a `svg` foreign context should be parsed as script data, not text. #2320
- `Tag#isFormSubmittable()` was updating the Tag's options. #2323
- The HTML pretty-printer would incorrectly trim whitespace when text followed an inline element in a block element. #2325
- Custom tags with hyphens or other non-letter characters in their names now work correctly as Data or RcData tags. Their closing tags are now tokenized properly. #2332
- When cloning an Element, the clone would retain the source's cached child Element list (if any), which could lead to incorrect results when modifying the clone's child elements. #2334
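The attribute-escaping change (#2337) in the 1.21.1 Improvements is easiest to see with a small example. The following is a minimal sketch in plain Java of what escaping `<` and `>` inside attribute values looks like. It is not jsoup's serializer, and the names `AttributeEscapeSketch` and `escapeAttributeValue` are hypothetical, chosen only for illustration.

```java
// Illustrative only: a hand-rolled escaper, NOT jsoup's implementation.
final class AttributeEscapeSketch {
    private AttributeEscapeSketch() {}

    /** Escapes characters that are unsafe inside a double-quoted HTML attribute value. */
    static String escapeAttributeValue(String value) {
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                // The two characters the changelog entry is about:
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A value that could break out of its context if re-parsed unescaped.
        String risky = "</style><img src=x onerror=alert(1)>";
        System.out.println("<p title=\"" + escapeAttributeValue(risky) + "\"></p>");
        // prints: <p title="&lt;/style&gt;&lt;img src=x onerror=alert(1)&gt;"></p>
    }
}
```

Because the angle brackets are emitted as entities, markup that is serialized and then parsed again cannot turn the attribute value into new tags, which is the kind of round-trip that mutation XSS relies on.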
jsoup 1.20.1
Changes
- To better follow the HTML5 spec and current browsers, the HTML parser no longer allows self-closing tags (`<foo />`) to close HTML elements by default. Foreign content (SVG, MathML), and content parsed with the XML parser, still supports self-closing tags. If you need specific HTML tags to support self-closing, you can register a custom tag via the `TagSet` configured in `Parser.tagSet()`, using `Tag#set(Tag.SelfClose)`. Standard void tags (such as `<img>`, `<br>`, etc.) continue to behave as usual and are not affected by this change. #2300.
- The following internal components have been deprecated. If you do happen to be using any of these, please take the opportunity now to migrate away from them, as they will be removed in jsoup 1.21.1.
  - `ChangeNotifyingArrayList`, `Document.updateMetaCharsetElement()`, `Document.updateMetaCharsetElement(boolean)`, `HtmlTreeBuilder.isContentForTagData(String)`, `Parser.isContentForTagData(String)`, `Parser.setTreeBuilder(TreeBuilder)`, `Tag.formatAsBlock()`, `Tag.isFormListed()`, `TokenQueue.addFirst(String)`, `TokenQueue.chompTo(String)`, `TokenQueue.chompToIgnoreCase(String)`, `TokenQueue.consumeToIgnoreCase(String)`, `TokenQueue.consumeWord()`, `TokenQueue.matchesAny(String...)`
Functional Improvements
- Rebuilt the HTML pretty-printer, to simplify and consolidate the implementation, improve consistency, support custom Tags, and provide a cleaner path for ongoing improvements. The specific HTML produced by the pretty-printer may be different from previous versions. #2286.
- Added the ability to define custom tags, and to modify properties of known tags, via the `TagSet` tag collection. Their properties can impact both the parse and how content is serialized (output as HTML or XML). #2285.
- `Element.cssSelector()` will prefer to return shorter selectors by using ancestor IDs when available and unique. E.g. `#id > div > p` instead of `html > body > div > div > p` #2283.
- Added `Elements.deselect(int index)`, `Elements.deselect(Object o)`, and `Elements.deselectAll()` methods to remove elements from the `Elements` list without removing them from the underlying DOM. Also added `Elements.asList()` method to get a modifiable list of elements without affecting the DOM. (Individual Elements remain linked to the DOM.) #2100.
- Added support for sending a request body from an InputStream with `Connection.requestBodyStream(InputStream stream)`. #1122.
- The XML parser now supports scoped xmlns: prefix namespace declarations, and applies the correct namespace to Tags and Attributes. Also, added `Tag#prefix()`, `Tag#localName()`, `Attribute#prefix()`, `Attribute#localName()`, and `Attribute#namespace()` to retrieve these. #2299.
- CSS identifiers are now escaped and unescaped correctly to the CSS spec. `Element#cssSelector()` will emit appropriately escaped selectors, and the QueryParser supports those. Added `Selector.escapeCssIdentifier()` and `Selector.unescapeCssIdentifier()`. #2297, #2305
... (truncated)
Changelog
Sourced from org.jsoup:jsoup's changelog.
1.21.1 (2025-Jun-23)
Changes
- Removed previously deprecated methods. #2317
- Deprecated the `:matchText` pseudo-selector due to its side effects on the DOM; use the new `::textnode` selector and the `Element#selectNodes(String css, Class<T> type)` method instead. #2343
- Deprecated `Connection.Response#bufferUp()` in lieu of `Connection.Response#readFully()` which can throw a checked IOException.
- Deprecated internal methods `Validate#ensureNotNull` (replaced by typed `Validate#expectNotNull`); protected HTML appenders from Attribute and Node.
- If you happen to be using any of the deprecated methods, please take the opportunity now to migrate away from them, as they will be removed in a future release.
Improvements
- Enhanced the `Selector` to support direct matching against nodes such as comments and text nodes. For example, you can now find an element that follows a specific comment: `::comment:contains(prices) + p` will select `p` elements immediately after a `<!-- prices: -->` comment. Supported types include `::node`, `::leafnode`, `::comment`, `::text`, `::data`, and `::cdata`. Node contextual selectors like `::node:contains(text)`, `:matches(regex)`, and `:blank` are also supported. Introduced `Element#selectNodes(String css)` and `Element#selectNodes(String css, Class<T> nodeType)` for direct node selection. #2324
- Added `TagSet#onNewTag(Consumer<Tag> customizer)`: register a callback that’s invoked for each new or cloned Tag when it’s inserted into the set. Enables dynamic tweaks of tag options (for example, marking all custom tags as self-closing, or everything in a given namespace as preserving whitespace).
- Made `TokenQueue` and `CharacterReader` autocloseable, to ensure that they will release their buffers back to the buffer pool, for later reuse.
- Added `Selector#evaluatorOf(String css)`, as a clearer way to obtain an Evaluator from a CSS query. An alias of `QueryParser.parse(String css)`.
- Custom tags (defined via the `TagSet`) in a foreign namespace (e.g. SVG) can be configured to parse as data tags.
- Added `NodeVisitor#traverse(Node)` to simplify node traversal calls (vs. importing `NodeTraversor`).
- Updated the default user-agent string to improve compatibility. #2341
- The HTML parser now allows the specific text-data type (Data, RcData) to be customized for known tags. (Previously, that was only supported on custom tags.) #2326.
- Added `Connection.Response#readFully()` as a replacement for `Connection.Response#bufferUp()` with an explicit IOException. Similarly, added `Connection.Response#readBody()` over `Connection.Response#body()`. Deprecated `Connection.Response#bufferUp()`. #2327
- When serializing HTML, the `<` and `>` characters are now escaped in attributes. This helps prevent a class of mutation XSS attacks. #2337
- Changed `Connection` to prefer using the JDK's HttpClient over HttpUrlConnection, if available, to enable HTTP/2 support by default. Users can disable via `-Djsoup.useHttpClient=false`. #2340
Bug Fixes
- The contents of a `script` in a `svg` foreign context should be parsed as script data, not text. #2320
- `Tag#isFormSubmittable()` was updating the Tag's options. #2323
- The HTML pretty-printer would incorrectly trim whitespace when text followed an inline element in a block element. #2325
- Custom tags with hyphens or other non-letter characters in their names now work correctly as Data or RcData tags. Their closing tags are now tokenized properly. #2332
- When cloning an Element, the clone would retain the source's cached child Element list (if any), which could lead to incorrect results when modifying the clone's child elements. #2334
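The `-Djsoup.useHttpClient=false` switch from the 1.21.1 Improvements (#2340) is an ordinary JVM system property, so it can also be set from code. A minimal sketch, assuming the property is set before jsoup executes its first request (the changelog does not say when jsoup reads it); the class and method names are hypothetical.

```java
// Sketch: opt out of the JDK HttpClient transport in code instead of via -D.
// Assumption: this runs before the first jsoup Connection is executed.
final class JsoupTransportConfig {
    /** Property name taken from the changelog entry (#2340). */
    static final String USE_HTTP_CLIENT = "jsoup.useHttpClient";

    private JsoupTransportConfig() {}

    /** Sets the same property that -Djsoup.useHttpClient=false sets on the command line. */
    static void preferHttpUrlConnection() {
        System.setProperty(USE_HTTP_CLIENT, "false");
    }

    /**
     * How this sketch interprets the property: enabled unless explicitly "false".
     * This mirrors the changelog's "by default" wording; it is not jsoup's own check.
     */
    static boolean httpClientRequested() {
        return Boolean.parseBoolean(System.getProperty(USE_HTTP_CLIENT, "true"));
    }

    public static void main(String[] args) {
        System.out.println("before: " + httpClientRequested());
        preferHttpUrlConnection();
        System.out.println("after:  " + httpClientRequested());
    }
}
```

Setting the property in code keeps the choice of transport next to the application's startup logic, which may be convenient where JVM flags are hard to control.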
1.20.1 (2025-Apr-29)
Changes
- To better follow the HTML5 spec and current browsers, the HTML parser no longer allows self-closing tags (`<foo />`) to close HTML elements by default. Foreign content (SVG, MathML), and content parsed with the XML parser, still supports self-closing tags. If you need specific HTML tags to support self-closing, you can register a custom tag via the `TagSet` configured in `Parser.tagSet()`, using `Tag#set(Tag.SelfClose)`. Standard void tags (such as `<img>`, `<br>`, etc.) continue to behave as usual and are not affected by this change. #2300.
- The following internal components have been deprecated. If you do happen to be using any of these, please take the opportunity now to migrate away from them, as they will be removed in jsoup 1.21.1.
  - `ChangeNotifyingArrayList`, `Document.updateMetaCharsetElement()`, `Document.updateMetaCharsetElement(boolean)`, `HtmlTreeBuilder.isContentForTagData(String)`, `Parser.isContentForTagData(String)`, `Parser.setTreeBuilder(TreeBuilder)`, `Tag.formatAsBlock()`, `Tag.isFormListed()`, `TokenQueue.addFirst(String)`, `TokenQueue.chompTo(String)`, `TokenQueue.chompToIgnoreCase(String)`, `TokenQueue.consumeToIgnoreCase(String)`, `TokenQueue.consumeWord()`, `TokenQueue.matchesAny(String...)`
Functional Improvements
- Rebuilt the HTML pretty-printer, to simplify and consolidate the implementation, improve consistency, support custom Tags, and provide a cleaner path for ongoing improvements. The specific HTML produced by the pretty-printer may be different from previous versions. #2286.
- Added the ability to define custom tags, and to modify properties of known tags, via the `TagSet` tag collection. Their properties can impact both the parse and how content is
... (truncated)
Commits
- `9a059f4` [maven-release-plugin] prepare release jsoup-1.21.1
- `a9f6ad0` Prep 1.21.1 release
- `63ed60b` Tidy up exception test
- `a4d451f` Improved unhandled node type error msg
- `cf88221` Added `::cdata` node selector
- `893706a` Deprecate `:matchText` selector (#2343)
- `2a73678` Added javadoc note for Connection#timeout
- `3f70665` Fix date format
- `2f48c65` Updated the default UA
- `42dbaa0` Cleanup redundant Syntax parameter
- Additional commits viewable in compare view
Technical Details
| ID: | 2075655 |
| UUID: | 3168776462 |
| Node ID: | PR_kwDOKxjbMM6bstpp |
| Host: | GitHub |
| Repository: | code-payments/code-android-app |