commit 3b323cde342ec0f79d6c7c65c57fe6a2e9660176
from: Aleksey Ryndin
date: Fri Aug 02 13:34:38 2024 UTC

Fix: re-implement (TempStringMax limitation)

commit - 3a4930680e0b3b7e67dc7caff51cd9b2064e80a3
commit + 3b323cde342ec0f79d6c7c65c57fe6a2e9660176
blob - a82c515de940a30810efca70c11eba80bb99cdf2
blob + 5f0968154aff79c37ca5195c4ed09ff30dd2478d
--- gtransl.retro
+++ gtransl.retro
@@ -2,47 +2,59 @@
 ~~~
 '\r\n s:format 'CRLF s:const
-:vgi:user-input '10_🔁_Text: CRLF s:append s:put #0 unix:exit ;
-:vgi:not-found '51_Not_found CRLF s:append s:put #0 unix:exit ;
-:vgi:bad-request '59_Bad_request CRLF s:append s:put #0 unix:exit ;
-:vgi:tmp-failure '40_Unexpected_error CRLF s:append s:put #0 unix:exit ;
+:vgi:print-line (s-) CRLF s:append s:put ;
+:vgi:user-input '10_🔁_Text: vgi:print-line #0 unix:exit ;
+:vgi:not-found '51_Not_found vgi:print-line #0 unix:exit ;
+:vgi:bad-request '59_Bad_request vgi:print-line #0 unix:exit ;
+:vgi:tmp-failure '40_Unexpected_error vgi:print-line #0 unix:exit ;
 ~~~
+According to the Gemini specification, a requested URL cannot exceed 1024 bytes.
+So we allot 1024 cells to store the URL, plus one cell for the terminating zero.
+~~~
+'url:Buffer d:create #1025 allot
+:url:get (-)
+  url:Buffer buffer:set
+  #1024 [ c:get dup ASCII:CR eq? [ drop ] if; dup ASCII:LF eq? [ drop ] if; buffer:add ] times
+;
+url:get
+~~~
+
 Parse the requested URL from standard input:
 * everything to the left of /gtransl/ is discarded, together with that path component;
-* the next path component is the source language (constant SL);
-* the next path component is the target language (constant TL);
+* the next path component is the source language (constant url:SL);
+* the next path component is the target language (constant url:TL);
 * if the input URL ends here, prompt the user for the string to translate;
-* finally, extract the query string with the text to translate (constant QUERY).
+* finally, extract the query string with the text to translate (constant url:QUERY).
 ~~~
-:cut-required-path (s-s)
-  dup '/gtransl/ s:index/string dup n:negative? [ vgi:not-found ] if
-  over s:length swap - '/gtransl/ s:length - #1 - s:right
+'/gtransl/ 'url:REQUIRED-PATH s:const
+:url:required-path (s-s)
+  dup url:REQUIRED-PATH s:index/string dup n:negative? [ vgi:not-found ] if
+  + url:REQUIRED-PATH s:length + n:inc
 ;
-:drop-fist-char (s-s) dup s:length #1 - s:right ;
-:extract-path-part (ss-s)
-  swap dup $/ s:index/char n:negative? [ vgi:not-found ] if
-  $/ s:split/char rot s:const
-  drop-fist-char
+:url:path-part (ss-s)
+  swap dup $/ s:index/char dup n:negative? [ vgi:not-found ] if
+  dup @TempStringMax gteq? [ vgi:bad-request ] if
+  over over s:left
+  rot rot + rot rot swap s:const
+  n:inc
 ;
-:user-input? dup s:length n:zero? [ vgi:user-input ] if ;
-:required-query (s-s) dup #0 s:fetch $? -eq? [ vgi:not-found ] if ;
-:extract-query (s-) required-query drop-fist-char 'QUERY s:const ;
-s:get
-  cut-required-path 'SL extract-path-part 'TL extract-path-part
-  user-input?
-  extract-query
+:url:check-user-input (s-s) dup s:length n:zero? [ vgi:user-input ] if ;
+:url:required-query (s-s) dup fetch $? -eq? [ vgi:not-found ] if ;
+:url:query (s-) url:check-user-input url:required-query n:inc 'url:QUERY const ;
+
+url:Buffer url:required-path 'url:SL url:path-part 'url:TL url:path-part url:query
 ~~~
 Validate the contents of the strings to rule out execution of an arbitrary command (command injection).
 ~~~
-:check-lang (s-) [ $a $z n:between? [ vgi:not-found ] -if ] s:for-each ;
-SL check-lang
-TL check-lang
+:url:check-lang (s-) [ $a $z n:between? [ vgi:not-found ] -if ] s:for-each ;
+url:SL url:check-lang
+url:TL url:check-lang
-:check-query (s-)
+:url:check-query (s-)
   [
     dup $a $z n:between? [ drop ] if;
     dup $A $Z n:between? [ drop ] if;
@@ -56,33 +68,38 @@ TL check-lang
     vgi:bad-request
   ] s:for-each
 ;
-QUERY check-query
+url:QUERY url:check-query
 ~~~
 The output of the curl command is stored in a buffer of 16 kilocells (plus one cell for ASCII:NULL).
 ~~~
 :html:BUFFER_SIZE #16384 ;
-'html:Buffer d:create html:BUFFER_SIZE #1 + allot
+'html:Buffer d:create html:BUFFER_SIZE n:inc allot
 ~~~
 To perform the HTTPS request we use curl or (for tests) cat on an existing file containing an HTML response.
 The output of the curl command is read from the given pipe handle until a zero character (not byte) is read.
 We assume that in this case the other side has closed the pipe.
 ~~~
-:html:read-buffer (h-h)
-  html:Buffer buffer:set html:BUFFER_SIZE [ dup file:read/c 0; buffer:add ] times
+:pipe:read (h-h) html:Buffer buffer:set html:BUFFER_SIZE [ dup file:read/c 0; buffer:add ] times
 ;
+'CurlCommand d:create #3072 allot
+:buffer:append (s-) [ buffer:add ] s:for-each ;
+:pipe:command-curl (-s)
+  CurlCommand buffer:set
+  'curl_-m_5_-s_--url-query_sl=" buffer:append
+  url:SL buffer:append
+  '"_--url-query_tl=" buffer:append
+  url:TL buffer:append
+  '"_"https://translate.google.com/m?q= buffer:append
+  url:QUERY buffer:append
+  '" buffer:append
+  CurlCommand
 ;
-:html:pipe-command-curl (-s)
-  QUERY TL SL 'curl_-m_5_-s_--url-query_sl=%s_--url-query_tl=%s_"https://translate.google.com/m?q=%s" s:format
-;
-:html:pipe-command (-s)
-  'GTRANSLRESPFILE s:empty [ unix:getenv ] sip dup s:length n:zero?
-  [ drop html:pipe-command-curl ] [ 'cat_%s s:format ] choose
-;
-:html:do-request
-  html:pipe-command file:R unix:popen html:read-buffer unix:pclose
-;
+:get-env (s-s) s:empty [ unix:getenv ] sip ;
+:zero-length? (s-s) dup s:length n:zero? ;
+:pipe:command (-s) 'GTRANSLRESPFILE get-env zero-length? [ drop pipe:command-curl ] [ 'cat_%s s:format ] choose ;
+:pipe:request pipe:command file:R unix:popen pipe:read unix:pclose ;
 ~~~
 Quick-and-dirty HTML parsing:
@@ -94,35 +111,63 @@ QUERY check-query
 Then, in the result, replace the most commonly used HTML escape sequences with the corresponding characters.
 ~~~
-:html:get-result-container (-s)
-  html:do-request
+:html:index/char (sc-n) s:index/char dup n:negative? [ vgi:tmp-failure ] if ;
+:html:index/result-container (s-n) '"result-container" s:index/string dup n:negative? [ vgi:tmp-failure ] if ;
+:buffer:append-n (sn-) [ dup I s:fetch buffer:add ] indexed-times drop ;
+:html:cut-head (sn-s) over buffer:set + repeat dup fetch 0; buffer:add n:inc again ;
+:html:unquote? (ssc-sf)
+  rot rot over over s:begins-with? [ drop drop drop FALSE ] -if;
+  s:length n:dec rot rot store swap dup n:inc rot html:cut-head drop TRUE
+;
+:html:unquote-head (s-s)
+  dup '& $& html:unquote? [ ] if;
+  dup '& $& html:unquote? [ ] if;
+  dup '& $& html:unquote? [ ] if;
-  html:Buffer '"result-container" s:index/string dup n:negative? [ vgi:tmp-failure ] if
-  html:Buffer s:length swap - html:Buffer swap s:right
+  dup '< $< html:unquote? [ ] if;
+  dup '< $< html:unquote? [ ] if;
+  dup '< $< html:unquote? [ ] if;
+  dup '< $< html:unquote? [ ] if;
-  dup $> s:index/char dup n:negative? [ vgi:tmp-failure ] if
-  swap dup s:length rot - s:right
+  dup '> $> html:unquote? [ ] if;
+  dup '> $> html:unquote? [ ] if;
+  dup '> $> html:unquote? [ ] if;
+  dup '> $> html:unquote? [ ] if;
-  dup $< s:index/char #1 - #1 swap s:substr
+  dup '" $" html:unquote? [ ] if;
+  dup '" $" html:unquote? [ ] if;
+  dup '" $" html:unquote? [ ] if;
-  '& '& s:replace-all '& '& s:replace-all '& '& s:replace-all
-  '< '< s:replace-all '< '< s:replace-all '< '< s:replace-all '< '< s:replace-all
-  '> '> s:replace-all '> '> s:replace-all '> '> s:replace-all '> '> s:replace-all
-  '" '" s:replace-all '" '" s:replace-all '" '" s:replace-all
-  '' '' s:replace-all '' '' s:replace-all '' '' s:replace-all
+  dup '' $' html:unquote? [ ] if;
+  dup '' $' html:unquote? [ ] if;
+  dup '' $' html:unquote? [ ] if;
 ;
+:html:unquote (s-s)
+  repeat
+    dup fetch 0; $& eq? [ html:unquote-head ] if n:inc
+  again
+;
+:html:get-result-container (-s)
+  pipe:request
+  html:Buffer html:index/result-container html:Buffer +
+  dup $> html:index/char + n:inc
+  dup $< html:index/char
+  html:Buffer buffer:set buffer:append-n
+  html:Buffer html:unquote drop
+  html:Buffer
+;
 ~~~
 Write the result in "text/gemini" format to standard output
 ~~~
-html:get-result-container
-'20_text/gemini CRLF s:append s:put
-'#_🔁_GTransl CRLF s:append s:put
+html:get-result-container (*1)
+'20_text/gemini vgi:print-line
+'#_🔁_GTransl vgi:print-line
 CRLF s:put
-CRLF SL 'From:_%s%s s:format s:put
-CRLF TL 'To:___%s%s s:format s:put
+CRLF url:SL 'From:_%s%s s:format s:put
+CRLF url:TL 'To:___%s%s s:format s:put
 CRLF s:put
-'``` CRLF s:append s:put
-(s-) CRLF s:append s:put
-'``` CRLF s:append s:put
+'``` vgi:print-line
+(*1) vgi:print-line
+'``` vgi:print-line
 ~~~
blob - 8fec8baad98ef610e4eb3dc983a414fa8dc107c8 (mode 644)
blob + /dev/null
Binary files tests/.tests.sh.swp and /dev/null differ
blob - /dev/null
blob + 70f11d44fef4c29794449581c69a2d9794a5874f (mode 644)
--- /dev/null
+++ tests/resp_escaped-05.html
@@ -0,0 +1,112 @@
+Google Translate
Translate
<&&>
blob - /dev/null
blob + 237ac13458ba4edd2454a53154bf7ccebc1f5533 (mode 644)
--- /dev/null
+++ tests/resp_limit-01.html
@@ -0,0 +1,155 @@
+Google Translate
Translate
Однажды в полночь тоскливую,
+Когда я размышлял, слабый и усталый,
+Над многими странными и любопытными
+Томами забытых знаний —
+Пока я кивал, почти дремля,
+Вдруг раздался стук,
+Как будто кто-то тихонько стучал,
+Стучал в дверь моей комнаты.
+"Это какой-то гость", — пробормотал я,
+"Стучит в дверь моей комнаты
+Только это и ничего больше".
+
+Ах, отчетливо я помню,
+Это было в унылом декабре,
+И каждый умирающий уголек
+Свой призрак на полу создавал.
+С нетерпением я желал завтрашнего дня;
+Тщетно я пытался позаимствовать
+Из своих книг прекращение печали
+Печаль по потерянной Ленор —
+По редкой и лучезарной деве
+Которую ангелы называют Ленор —
+Безымянной здесь навеки.
\ No newline at end of file blob - da2a7cd65e33a46bf0e8919df18a0640a8d30a9b blob + b54dcd8d1b57d2589951c4a9840201c030883327 --- tests/tests.sh +++ tests/tests.sh @@ -17,6 +17,10 @@ echo "gemini://any-key.press/vgi/gtransl/auto/?\" ; ls echo "Bad request tests..." echo "gemini://any-key.press/vgi/gtransl/auto/ru/?\" ; ls" | ./gtransl.retro | head -n 1 | \ grep "^59 Bad request" > /dev/null && echo "passed" || echo "FAILED" +echo "gemini://any-key.press/vgi/gtransl/llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll/ru/?q" | ./gtransl.retro | head -n 1 | \ + grep "^59 Bad request" > /dev/null && echo "passed" || echo "FAILED" +echo "gemini://any-key.press/vgi/gtransl/en/llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll/?q" | ./gtransl.retro | head -n 1 | \ + grep "^59 Bad request" > /dev/null && echo "passed" || echo "FAILED" echo "Escaping tests..." 
 echo "gemini://any-key.press/vgi/gtransl/sl/tl/?query" | \
@@ -31,6 +35,9 @@ echo "gemini://any-key.press/vgi/gtransl/sl/tl/?query"
 echo "gemini://any-key.press/vgi/gtransl/sl/tl/?query" | \
     GTRANSLRESPFILE=tests/resp_escaped-04.html ./gtransl.retro | head -n 8 | tail -n 1 | \
     grep "^lt=< gt=>" > /dev/null && echo "passed" || echo "FAILED"
+echo "gemini://any-key.press/vgi/gtransl/sl/tl/?query" | \
+    GTRANSLRESPFILE=tests/resp_escaped-05.html ./gtransl.retro | head -n 8 | tail -n 1 | \
+    grep "^<&&>" > /dev/null && echo "passed" || echo "FAILED"
 echo "Multiline tests..."
 echo "gemini://any-key.press/vgi/gtransl/sl/tl/?query" | \
@@ -39,3 +46,9 @@ echo "gemini://any-key.press/vgi/gtransl/sl/tl/?query"
 echo "gemini://any-key.press/vgi/gtransl/sl/tl/?query" | \
     GTRANSLRESPFILE=tests/resp_multiline-01.html ./gtransl.retro | head -n 9 | tail -n 1 | \
     grep "^world" > /dev/null && echo "passed" || echo "FAILED"
+echo "gemini://any-key.press/vgi/gtransl/auto/ru/?Once%20upon%20a%20midnight%20dreary%2C%0AWhile%20I%20pondered%2C%20weak%20and%20weary%2C%0AOver%20many%20a%20quaint%20and%20curious%0AVolume%20of%20forgotten%20lore%E2%80%94%0AWhile%20I%20nodded%2C%20nearly%20napping%2C%0ASuddenly%20there%20came%20a%20tapping%2C%0AAs%20of%20some%20one%20gently%20rapping%2C%0ARapping%20at%20my%20chamber%20door.%0A%22%27T%20is%20some%20visitor%2C%22%20I%20muttered%2C%0A%22Tapping%20at%20my%20chamber%20door%0AOnly%20this%20and%20nothing%20more.%22%20%0AAh%2C%20distinctly%20I%20remember%2C%0AIt%20was%20in%20the%20bleak%20December%2C%0AAnd%20each%20separate%20dying%20ember%0AWrought%20its%20ghost%20upon%20the%20floor.%0AEagerly%20I%20wished%20the%20morrow%3B%0AVainly%20I%20had%20sought%20to%20borrow%0AFrom%20my%20books%20surcease%20of%20sorrow%0ASorrow%20for%20the%20lost%20Lenore%E2%80%94%0AFor%20the%20rare%20and%20radiant%20maiden%0AWhom%20the%20angels%20name%20Lenore%E2%80%94%0ANameless%20here%20for%20evermore." | \
+    GTRANSLRESPFILE=tests/resp_limit-01.html ./gtransl.retro | head -n 8 | tail -n 1 |
+    grep "^Однажды в полночь тоскливую," > /dev/null && echo "passed" || echo "FAILED"
+echo "gemini://any-key.press/vgi/gtransl/auto/ru/?Once%20upon%20a%20midnight%20dreary%2C%0AWhile%20I%20pondered%2C%20weak%20and%20weary%2C%0AOver%20many%20a%20quaint%20and%20curious%0AVolume%20of%20forgotten%20lore%E2%80%94%0AWhile%20I%20nodded%2C%20nearly%20napping%2C%0ASuddenly%20there%20came%20a%20tapping%2C%0AAs%20of%20some%20one%20gently%20rapping%2C%0ARapping%20at%20my%20chamber%20door.%0A%22%27T%20is%20some%20visitor%2C%22%20I%20muttered%2C%0A%22Tapping%20at%20my%20chamber%20door%0AOnly%20this%20and%20nothing%20more.%22%20%0AAh%2C%20distinctly%20I%20remember%2C%0AIt%20was%20in%20the%20bleak%20December%2C%0AAnd%20each%20separate%20dying%20ember%0AWrought%20its%20ghost%20upon%20the%20floor.%0AEagerly%20I%20wished%20the%20morrow%3B%0AVainly%20I%20had%20sought%20to%20borrow%0AFrom%20my%20books%20surcease%20of%20sorrow%0ASorrow%20for%20the%20lost%20Lenore%E2%80%94%0AFor%20the%20rare%20and%20radiant%20maiden%0AWhom%20the%20angels%20name%20Lenore%E2%80%94%0ANameless%20here%20for%20evermore." | \
+    GTRANSLRESPFILE=tests/resp_limit-01.html ./gtransl.retro | head -n 30 | tail -n 1 | \
+    grep "^Безымянной здесь навеки." > /dev/null && echo "passed" || echo "FAILED"
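
For readers who do not read Retro, the URL handling in gtransl.retro (required path, two path components with a length check, then the query) can be sketched in Python. This is only an illustration under stated assumptions, not the project's code: `parse_url`, `take_path_part` and `VgiStatus` are hypothetical names, and the value 512 for `TEMP_STRING_MAX` is an assumed stand-in for Retro's `TempStringMax`. The sketch raises an exception where the Retro words print a status line and exit.

```python
# Illustrative sketch only: all names and the 512 limit are assumptions.
TEMP_STRING_MAX = 512        # assumed size limit of a Retro temporary string
REQUIRED_PATH = "/gtransl/"


class VgiStatus(Exception):
    """Carries the Gemini status line the script would print before exiting."""


def take_path_part(rest):
    """Split off the next path component (roughly what url:path-part does)."""
    slash = rest.find("/")
    if slash < 0:
        raise VgiStatus("51 Not found")
    if slash >= TEMP_STRING_MAX:
        # The component would not fit into a temporary string.
        raise VgiStatus("59 Bad request")
    return rest[:slash], rest[slash + 1:]


def parse_url(url):
    """Return (source language, target language, query text)."""
    start = url.find(REQUIRED_PATH)
    if start < 0:
        raise VgiStatus("51 Not found")
    rest = url[start + len(REQUIRED_PATH):]
    sl, rest = take_path_part(rest)
    tl, rest = take_path_part(rest)
    if not rest:
        # URL ends after the languages: ask the user for the text.
        raise VgiStatus("10 🔁 Text:")
    if rest[0] != "?":
        raise VgiStatus("51 Not found")
    return sl, tl, rest[1:]


if __name__ == "__main__":
    print(parse_url("gemini://any-key.press/vgi/gtransl/auto/ru/?hello"))
```

The two new "Bad request" cases in tests/tests.sh correspond to the `slash >= TEMP_STRING_MAX` branch: an over-long source or target language component is rejected before it is copied.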