Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: legacy json generator #92

Closed
wants to merge 18 commits into from
Closed

feat: legacy json generator #92

wants to merge 18 commits into from

Conversation

flakey5
Copy link
Member

@flakey5 flakey5 commented Sep 5, 2024

Description

Legacy json generator aiming to repro 1:1 the existing json format for docs

Still a sizeable amount left to be done, so opening this as a draft.

TODO

  • fix type references (i.e. Returns: {boolean} instead of Returns: [](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures#Boolean_type))
  • descriptions
  • introduced_in doesn't work?
  • Invalid param "{ lineLengths }" error

Validation

mkdir original && mkdir new

node node/tools/doc/generate.mjs node/doc/api/addons.md --output-directory original/

node api-docs-tooling/bin/cli.mjs -i node/doc/api/addons.md -t legacy-json -o new/

git diff original new

@flakey5 flakey5 marked this pull request as ready for review September 24, 2024 03:30
@flakey5 flakey5 requested a review from a team as a code owner September 24, 2024 03:30
@flakey5
Copy link
Member Author

flakey5 commented Sep 24, 2024

Marking ready for review since there shouldn't be any big changes left, but note this isn't generating a full 1:1 match with the original generator yet. There's still some fixes and testing I need to do

@ovflowd
Copy link
Member

ovflowd commented Sep 24, 2024

Marking ready for review since there shouldn't be any big changes left, but note this isn't generating a full 1:1 match with the original generator yet. There's still some fixes and testing I need to do

Can you describe some of the notorious changes? Some discrepancies might be fine :)


const previousEntry = entries[i - 1];

const previousDepth = previousEntry.heading.depth;
Copy link
Member

@ovflowd ovflowd Sep 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting way of iterating this. Wouldn't it be easier to use the heading section topology we already created, which has heading info that tells which depth you are based on the heading depth?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be easier to use the heading section topology we already created

Wdym?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking you can use unified here, to iterate through the headings, (which are nodes) and then automatically add children to each parent.

I think I could explain this over a call, but pretty much using the visit API to self reference the previous heading in terms of depth, and then visit next levels.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My fear is that the code below is too much convoluted and hacky 🙈

src/metadata.mjs Outdated Show resolved Hide resolved
src/generators/legacy-json-all/index.mjs Outdated Show resolved Hide resolved
src/generators/legacy-json/constants.mjs Outdated Show resolved Hide resolved
src/generators/legacy-json/index.mjs Outdated Show resolved Hide resolved
package.json Show resolved Hide resolved
@flakey5 flakey5 requested a review from ovflowd October 9, 2024 00:29
flakey5 and others added 15 commits October 8, 2024 17:29
Closes #57

Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Co-authored-by: Augustin Mauroy <augustin.mauroy@outlook.fr>
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Co-authored-by: Caner Akdas <canerakdas@gmail.com>
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Co-authored-by: Augustin Mauroy <augustin.mauroy@outlook.fr>
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
@ovflowd
Copy link
Member

ovflowd commented Oct 12, 2024

@flakey5 what is still missing here?

return section;
};

await Promise.all(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we use a for here? To avoid using Promise.all?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what was originally done, then after an initial review, we put a Promise.all instead of a series of awaits in a loop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I stuttering O.o -- I still believe await would be better here for readability

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in terms of code readability, it's the same for me. But from what I've been able to understand in this kind of case it's better for performance to use Promise.all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been able to understand in this kind of case it's better for performance to use Promise.all.

Why?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Unfortunately I don't remember much about it, but I'd seen a blog post about it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, but that would only be the case if there's threading. It is correct to say that with the Promise approach at least the Promises will all be dispatched in parallel, but they will be executed one per time.

So in the end the feeling might be that it is faster, although it might not be. I do believe await syntax would better suit here, but no strong feelings.

return {
textRaw: text,
type: head.data.type,
name: text.toLowerCase().replaceAll(' ', '_'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document why this is done

parameter = undefined;

// Try finding a parameter this is being shared with
for (const otherParam of rawParameters) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit convoluted. Do you believe there's something you can do here to optimise the code?

I can see a few ways of optimising the signature generation and reducing its footprint.

function textJoin(nodes) {
return nodes
.map(node => {
switch (node.type) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the body of this map be extracted to a function?

/**
* @param {Array<import('mdast').PhrasingContent>} nodes
*/
function textJoin(nodes) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you just making this Markdown-ish? We already have this https://github.com/nodejs/api-docs-tooling/blob/main/src/utils/unist.mjs#L12

}

// Render the description as if it was html
section.desc = unified()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to do this, can you simply call this? https://github.com/nodejs/api-docs-tooling/blob/main/src/utils/remark.mjs#L20

We don't need remark-html, remark-rehype is good enough and does the same :)

(remark-rehype then rehype-stringify)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also remove the depenendecy of remark-html

Copy link
Member

@ovflowd ovflowd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@flakey5 I left a few comments, could you please attempt to address'em and dive deep into simplifying and extracting complex logic?

ovflowd

This comment was marked as duplicate.

@ovflowd ovflowd dismissed their stale review October 12, 2024 19:56

Accidentally approved.

Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
* @param {Array<import('../types.d.ts').List} markdownParameters The properties in the AST
* @returns {import('../types.d.ts').MethodSignature | undefined}
*/
export default (textRaw, markdownParameters) => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo this is broken still

Comment on lines +33 to +48
function createMeta(entry) {
const makeArrayIfNotAlready = val => (Array.isArray(val) ? val : [val]);

const { added_in, n_api_version, deprecated_in, removed_in, changes } = entry;
return {
changes,
added: added_in ? makeArrayIfNotAlready(added_in) : undefined,
napiVersion: n_api_version
? makeArrayIfNotAlready(n_api_version)
: undefined,
deprecated: deprecated_in
? makeArrayIfNotAlready(deprecated_in)
: undefined,
removed: removed_in ? makeArrayIfNotAlready(removed_in) : undefined,
};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
function createMeta(entry) {
const makeArrayIfNotAlready = val => (Array.isArray(val) ? val : [val]);
const { added_in, n_api_version, deprecated_in, removed_in, changes } = entry;
return {
changes,
added: added_in ? makeArrayIfNotAlready(added_in) : undefined,
napiVersion: n_api_version
? makeArrayIfNotAlready(n_api_version)
: undefined,
deprecated: deprecated_in
? makeArrayIfNotAlready(deprecated_in)
: undefined,
removed: removed_in ? makeArrayIfNotAlready(removed_in) : undefined,
};
}
function createMeta(entry) {
const makeArrayIfNotAlready = val => (Array.isArray(val) ? val : [val]);
const { added_in, n_api_version, deprecated_in, removed_in, changes } = entry;
if (!added_in && !n_api_version && !deprecated_in && !removed_in && changes.length < 1) {
return undefined;
}
return {
changes: changes.length > 0 ? changes : undefined,
added: added_in ? makeArrayIfNotAlready(added_in) : undefined,
napiVersion: n_api_version
? makeArrayIfNotAlready(n_api_version)
: undefined,
deprecated: deprecated_in
? makeArrayIfNotAlready(deprecated_in)
: undefined,
removed: removed_in ? makeArrayIfNotAlready(removed_in) : undefined,
};
}

In the old generator, if there was no metadata, the property was emitted entirely.

@avivkeller
Copy link
Member

avivkeller commented Nov 9, 2024

Hey @flakey5. I've made some local modifications to this, and it now generates (kind of?) 1:1 code (still need to work out some kinks). Some of my changes are suggestions in the review above, but if you'd like, I can open a PR into your branch?

@avivkeller
Copy link
Member

avivkeller commented Nov 10, 2024

Good news! My local changes are almost 1:1 with the original JSON. The differences are:

  1. The HTML here is generated with Shiki, not HLJS
  2. This parser seems to be more accurate when it comes to signature parsing and optionals detection.

Changes: 44d12f5
Diff: https://github.com/nodejs/api-docs-tooling/commit/44d12f577c95d398ceed6c5077dd9e99d623f843.diff

Co-authored-by: Aviv Keller <redyetidev@gmail.com>
@flakey5
Copy link
Member Author

flakey5 commented Nov 10, 2024

I can open a PR into your branch?

That sgtm! Thank you!

@avivkeller
Copy link
Member

I tried to open a PR, but it has a conflict, probably because I rebased.

You can still apply the changes with git apply and the diff file in the comment above

https://git-scm.com/docs/git-apply

@avivkeller
Copy link
Member

I now believe my modifications are as close to 1:1 as possible. I'll do some final checks, but if that's the case, I'll open a second PR stemming off of this one, as I can't merge them due to conflicts.

@avivkeller avivkeller mentioned this pull request Nov 10, 2024
2 tasks
@avivkeller
Copy link
Member

I hope you don't mind I opened #142. If you mind, you can use the data from that PR and merge it into this one, but that'll have conflicts, and I don't have the access to do it myself.

@flakey5
Copy link
Member Author

flakey5 commented Nov 12, 2024

Superseded by #142

@flakey5 flakey5 closed this Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants