Skip to content

feat: add scene analysis API#969

Open
cuidong233 wants to merge 2 commits into
OpenWebGAL:mainfrom
cuidong233:parser-audio-arg-boundary-tests
Open

feat: add scene analysis API#969
cuidong233 wants to merge 2 commits into
OpenWebGAL:mainfrom
cuidong233:parser-audio-arg-boundary-tests

Conversation

@cuidong233
Copy link
Copy Markdown

Summary\n- add a parser-level scene analysis API for WebGAL scripts\n- expose SceneParser.analyze() and analyzeScene() for editor/tooling use\n- collect labels, scene references, choices, assets, variable usage, command summaries, and diagnostics\n- report duplicate labels, missing labels, unknown scenes, empty choices/assets, and simple unreachable-flow hints\n\n## Review notes\n- Feature-focused code change for WebGAL/GalGame script tooling\n- Diff size: 849 insertions\n- Production code: 825 insertions across sceneAnalyzer.ts and parser exports\n- Test coverage is intentionally small: 24 inserted lines only to verify the public API\n\n## Testing\n- cd packages/parser && yarn vitest run parser.test.ts\n- cd packages/parser && yarn build-ci

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new scene analyzer module (sceneAnalyzer.ts) and integrates it into SceneParser to perform static analysis on scene files, validating labels, references, choices, assets, variables, and control flow. Feedback on these changes highlights several issues: a potential ReDoS vulnerability and state preservation bug in extractVariableNames due to global regex usage, false positives in control flow validation for conditional terminal commands, path resolution mismatches with leading slashes in isSceneResolved, and incorrect line number reporting (line 0) for assets without explicit line numbers.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +724 to +737
function extractVariableNames(expression: string): string[] {
const withoutStrings = expression.replace(/(["'`])(?:\\.|(?!\1).)*\1/g, ' ');
const names = new Set<string>();
let match: RegExpExecArray | null;

while ((match = VARIABLE_TOKEN_RE.exec(withoutStrings)) !== null) {
const token = match[0];
if (!RESERVED_VARIABLE_WORDS.has(token.toLowerCase()) && Number.isNaN(Number(token))) {
names.add(token);
}
}

return Array.from(names);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The VARIABLE_TOKEN_RE regular expression is defined with the global (g) flag at the module level. Using exec in a loop on a global regex maintains state (lastIndex) across multiple invocations of extractVariableNames, leading to incorrect parsing or skipped matches on subsequent calls. Additionally, the regex used to strip strings /(["'])(?:\.|(?!\1).)*\1/gcontains overlapping paths that can lead to exponential backtracking (ReDoS) on unmatched quotes with many backslashes. We can resolve both issues by using safe non-backreferencing string matching andString.prototype.match()instead ofexec()`.

function extractVariableNames(expression: string): string[] {
  const withoutStrings = expression.replace(/\"(?:\\.|[^\"\\])*\"|'(?:\\.|[^'\\])*'|`(?:\\.|[^`\\])*`/g, ' ');
  const names = new Set<string>();
  const matches = withoutStrings.match(VARIABLE_TOKEN_RE) || [];

  for (const token of matches) {
    if (!RESERVED_VARIABLE_WORDS.has(token.toLowerCase()) && Number.isNaN(Number(token))) {
      names.add(token);
    }
  }

  return Array.from(names);
}

Comment on lines +600 to +623
context.scene.sentenceList.forEach((sentence, index) => {
if (!TERMINAL_COMMANDS.has(sentence.command)) {
return;
}

const nextMeaningful = findNextMeaningfulSentence(context.scene.sentenceList, index + 1);
if (!nextMeaningful) {
return;
}

const [nextSentence, nextIndex] = nextMeaningful;
if (nextSentence.command === commandType.label) {
return;
}

pushWarning(
context,
nextSentence,
nextIndex,
'unreachable-after-terminal-command',
'info',
'This sentence follows a terminal command before the next label.',
);
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If a terminal command (such as changeScene) has a conditional when argument, the control flow is not guaranteed to terminate. Marking subsequent sentences as unreachable in this case will result in false positives. We should skip control flow validation for commands that have a when argument.

  context.scene.sentenceList.forEach((sentence, index) => {
    if (!TERMINAL_COMMANDS.has(sentence.command)) {
      return;
    }

    const isConditional = sentence.args.some((arg) => arg.key === 'when');
    if (isConditional) {
      return;
    }

    const nextMeaningful = findNextMeaningfulSentence(context.scene.sentenceList, index + 1);
    if (!nextMeaningful) {
      return;
    }

    const [nextSentence, nextIndex] = nextMeaningful;
    if (nextSentence.command === commandType.label) {
      return;
    }

    pushWarning(
      context,
      nextSentence,
      nextIndex,
      'unreachable-after-terminal-command',
      'info',
      'This sentence follows a terminal command before the next label.',
    );
  });

Comment on lines +770 to +777
function isSceneResolved(context: AnalyzerContext, target: string): boolean {
if (context.knownScenes.size === 0) {
return true;
}

const normalized = normalizeSceneTarget(target);
return context.knownScenes.has(normalized) || context.knownScenes.has(stripLeadingSlash(normalized));
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent mismatches between known scenes and referenced scenes due to inconsistent leading slashes (e.g., /start.txt vs start.txt), we should check both the normalized path, the stripped path, and the absolute-style path.

Suggested change
function isSceneResolved(context: AnalyzerContext, target: string): boolean {
if (context.knownScenes.size === 0) {
return true;
}
const normalized = normalizeSceneTarget(target);
return context.knownScenes.has(normalized) || context.knownScenes.has(stripLeadingSlash(normalized));
}
function isSceneResolved(context: AnalyzerContext, target: string): boolean {
if (context.knownScenes.size === 0) {
return true;
}
const normalized = normalizeSceneTarget(target);
const stripped = stripLeadingSlash(normalized);
return (
context.knownScenes.has(normalized) ||
context.knownScenes.has(stripped) ||
context.knownScenes.has("/" + stripped)
);
}

Comment on lines +585 to +595
if (!asset.url || asset.url.trim().length === 0) {
context.warnings.push({
severity: 'warning',
code: 'empty-asset-url',
message: `Asset "${asset.name}" has an empty URL.`,
lineNumber: asset.lineNumber,
sentenceIndex: asset.sentenceIndex,
command: asset.sourceCommand,
commandRaw: asset.sourceCommandRaw,
});
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If asset.lineNumber is 0 (which is common for assets parsed without explicit line numbers), the warning will report line 0. We should fall back to the sentence's actual line number using getSentenceLineNumber to ensure a valid 1-based line number is always reported.

    if (!asset.url || asset.url.trim().length === 0) {
      const sentence = context.scene.sentenceList[asset.sentenceIndex];
      const lineNumber = asset.lineNumber > 0 ? asset.lineNumber : getSentenceLineNumber(sentence, asset.sentenceIndex);
      context.warnings.push({
        severity: 'warning',
        code: 'empty-asset-url',
        message: "Asset \"" + asset.name + "\" has an empty URL.",
        lineNumber,
        sentenceIndex: asset.sentenceIndex,
        command: asset.sourceCommand,
        commandRaw: asset.sourceCommandRaw,
      });
    }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant