Commit Graph

309 Commits

Author SHA1 Message Date
rafaelsideguide
d4574851be Added rpc definition 2024-05-15 08:40:21 -03:00
rafaelsideguide
47c20c80ab Update auth.ts 2024-05-15 08:34:49 -03:00
Ikko Eltociear Ashimine
e91c122c69 Merge branch 'main' into patch-1 2024-05-15 12:14:52 +09:00
Nicolas
7d8ceab6de Merge branch 'feat/rate-limits' of https://github.com/mendableai/firecrawl into feat/rate-limits 2024-05-14 14:48:01 -07:00
Nicolas
0e0faa28b3 Update auth.ts 2024-05-14 14:47:36 -07:00
rafaelsideguide
672eddb999 updated rpc 2024-05-14 18:47:21 -03:00
Nicolas
4761ea510b Update rate-limiter.ts 2024-05-14 14:26:42 -07:00
rafaelsideguide
40ad97dee8 added rate limits 2024-05-14 18:08:31 -03:00
Nicolas
27e1e22a0a Update index.test.ts 2024-05-14 12:28:25 -07:00
Nicolas
a0fdc6f7c6 Nick: 2024-05-14 12:12:40 -07:00
Nicolas
7f31959be7 Nick: 2024-05-14 12:04:36 -07:00
Nicolas
8a72cf556b Nick: 2024-05-13 21:10:58 -07:00
Nicolas
a96fc5b96d Nick: 4x speed 2024-05-13 20:45:11 -07:00
Nicolas
e26008a833 Merge branch 'main' of https://github.com/mendableai/firecrawl 2024-05-13 19:54:13 -07:00
Nicolas
512449e1aa Nick: v21 2024-05-13 19:54:12 -07:00
Nicolas
bd27b0e17e Merge pull request #142 from mendableai/doc/crawl-limit-default
[Doc] Added default value for crawlOptions.limit
2024-05-13 18:38:09 -07:00
Nicolas
aa0c8188c9 Nick: 408 handling 2024-05-13 18:34:00 -07:00
Nicolas
999176d576 Merge branch 'main' of https://github.com/mendableai/firecrawl 2024-05-13 13:57:34 -07:00
Nicolas
f3ec21d9c4 Update runWebScraper.ts 2024-05-13 13:57:22 -07:00
Nicolas
65d89afba9 Nick: 2024-05-13 13:01:43 -07:00
Eric Ciarla
4cc46d4af8 Update models.ts 2024-05-13 15:23:31 -04:00
rafaelsideguide
8eb2e95f19 Cleaned up 2024-05-13 16:13:10 -03:00
Nicolas
2ce045912f Nick: disable vision right now 2024-05-13 10:56:08 -07:00
rafaelsideguide
f4348024c6 Added check during scraping to deal with pdfs
Checks if the URL is a PDF during the scraping process (single_url.ts).

TODO: Run integration tests - Does this strat affect the running time?

ps. Some comments need to be removed if we decide to proceed with this strategy.
2024-05-13 09:13:42 -03:00
Rafael Miller
5a2712fa5a Merge branch 'main' into detect-pdfs 2024-05-10 15:53:13 -03:00
rafaelsideguide
df16890f84 Added default value for crawlOptions.limit 2024-05-10 11:59:33 -03:00
rafaelsideguide
18480b2005 Removed .env.example, improved docs and docker compose envs 2024-05-10 11:38:17 -03:00
Nicolas
66bd1e4020 Update website_params.ts 2024-05-09 18:41:15 -07:00
Nicolas
c02a82c282 Update main.py 2024-05-09 18:02:34 -07:00
Nicolas
efc6fcb474 Merge branch 'main' of https://github.com/mendableai/firecrawl 2024-05-09 18:01:04 -07:00
Nicolas
73687822ad Update main.py 2024-05-09 18:00:58 -07:00
Nicolas
d21091bb06 Update single_url.ts 2024-05-09 17:52:46 -07:00
Nicolas
be85008622 Nick: better 2024-05-09 17:48:11 -07:00
Nicolas
be5661a768 Nick: a lot better 2024-05-09 17:45:16 -07:00
Nicolas
fce17e6beb Update credit_billing.ts 2024-05-09 15:29:58 -07:00
rafaelsideguide
f4d8b2c89a Updated docs 2024-05-09 10:36:56 -03:00
Nicolas
aa6b84c5fa Nick: readme 2024-05-08 17:41:15 -07:00
Nicolas
d9da4b53f8 Update example.py 2024-05-08 17:36:40 -07:00
Nicolas
4c88d5da66 Nick: v8 python 2024-05-08 17:35:16 -07:00
Nicolas
e6dbbf1bab Nick: fixes js and pydantic implementation 2024-05-08 17:16:59 -07:00
Nicolas
c89964b230 Nick: 2024-05-08 16:38:49 -07:00
Nicolas
9541ff6b30 Nick: 429 addressed 2024-05-08 15:14:39 -07:00
Nicolas
3bfef646e0 Update index.test.ts 2024-05-08 13:23:53 -07:00
Nicolas
6ced8e73a7 Update index.test.ts 2024-05-08 13:13:38 -07:00
Nicolas
c50076c377 Update websites.json 2024-05-08 13:04:17 -07:00
Nicolas
1296928879 Update index.test.ts 2024-05-08 13:00:20 -07:00
Nicolas
4a5f87623c Merge pull request #118 from mendableai/feat/test-suite
[Test] Added integration tests suite
2024-05-08 12:47:17 -07:00
Nicolas
fb7a8fd73f Delete test_screenshot.png 2024-05-08 12:39:32 -07:00
Nicolas
c635688ddb Nick: test suite 2024-05-08 12:36:54 -07:00
Nicolas
d34b4de6ac Update websites.json 2024-05-08 12:27:45 -07:00
Nicolas
a0a67f124a Update index.test.ts 2024-05-08 12:26:04 -07:00
Nicolas
b7e3104c7b Ni 2024-05-08 12:18:53 -07:00
Nicolas
ad58bc2820 Nick: test suite init 2024-05-08 11:38:46 -07:00
Eric Ciarla
d280bcadf3 Add keyAuth 2024-05-07 13:52:42 -04:00
Nicolas
056b0ec24d Merge branch 'main' into feat/test-suite 2024-05-07 10:41:09 -07:00
Nicolas
dcedb8d798 Merge branch 'main' into feat/max-depth 2024-05-07 10:20:49 -07:00
Nicolas
6505bf6bf2 Merge branch 'main' into feat/max-depth 2024-05-07 10:20:44 -07:00
Nicolas
bdbee963f7 Merge branch 'main' into nsc/cancel-job 2024-05-07 10:13:43 -07:00
rafaelsideguide
61d615c04b Added tests 2024-05-07 14:03:00 -03:00
rafaelsideguide
e1f52c538f nested includeHtml inside pageOptions 2024-05-07 13:40:24 -03:00
Nicolas
f46bf19fa5 Nick: 2024-05-07 09:26:52 -07:00
rafaelsideguide
83f3408634 Added max depth option 2024-05-07 11:06:26 -03:00
Nicolas
2e3ff85509 Update crawl-cancel.ts 2024-05-06 17:22:16 -07:00
Nicolas
6d5da358cc Nick: cancel job 2024-05-06 17:16:43 -07:00
rafaelsideguide
509250c4ef changed to includeHtml 2024-05-06 19:45:56 -03:00
rafaelsideguide
538355f1af Added toMarkdown option 2024-05-06 11:36:44 -03:00
Nicolas
d1b6f6dcde Update fly.toml 2024-05-04 13:49:09 -07:00
Nicolas
cd9a0840b5 Update search.ts 2024-05-04 13:13:15 -07:00
Nicolas
5229a4902b Update search.ts 2024-05-04 13:09:11 -07:00
Nicolas
ce7bab7b35 Update status.ts 2024-05-04 13:00:38 -07:00
Nicolas
15b774e974 Update index.ts 2024-05-04 12:44:30 -07:00
Nicolas
67f135a5b6 Update crawl-status.ts 2024-05-04 12:31:28 -07:00
Nicolas
2aa09a3000 Nick: partial docs working, cleaner 2024-05-04 12:30:12 -07:00
Nicolas
00373228fa Update index.ts 2024-05-04 11:53:16 -07:00
rafaelsideguide
fbb4c63a1a [Test] Added integration tests suite
solves #15
2024-05-03 17:23:25 -03:00
Nicolas
21cdaf5996 Update log_job.ts 2024-05-02 12:40:49 -07:00
Eric Ciarla
caf3f9eede Add Posthog Logging 2024-05-02 15:30:22 -04:00
Nicolas
8a95cb42f0 Update models.ts 2024-04-30 18:36:21 -07:00
Nicolas
4967536501 Update index.ts 2024-04-30 18:19:55 -07:00
Nicolas
768166b066 Update single_url.ts 2024-04-30 16:57:44 -07:00
Nicolas
a386259511 Update scrape.ts 2024-04-30 16:35:44 -07:00
Nicolas
dfcf39f4c0 Update scrape.ts 2024-04-30 16:19:59 -07:00
Nicolas
3c7030dbb1 Nick: improvements 2024-04-30 16:19:32 -07:00
Nicolas
cbd9e88b77 Merge branch 'main' into llm-extraction 2024-04-30 14:49:20 -07:00
Nicolas
4f526cff92 Nick: cleanup 2024-04-30 12:19:43 -07:00
Caleb Peffer
d9d206aff6 Caleb: 2024-04-30 10:27:39 -07:00
Caleb Peffer
d1235a0029 Caleb: switched back to markdown for extraction 2024-04-30 10:23:12 -07:00
Caleb Peffer
ad9c8e77d1 Caleb: commented out massive test 2024-04-30 10:22:09 -07:00
Caleb Peffer
a32f2b37b6 Caleb: logs work 2024-04-30 10:21:41 -07:00
Caleb Peffer
3ca9e5153f Caleb: trying to get loggin workng 2024-04-30 09:20:15 -07:00
rafaelsideguide
a095e1b63d Resolve merge conflicts with main 2024-04-30 10:54:18 -03:00
rafaelsideguide
35480bd2ad Update index.test.ts 2024-04-30 10:40:32 -03:00
rafaelsideguide
d3c36adaa7 Update index.ts 2024-04-29 17:58:47 -03:00
Caleb Peffer
79cd7d2ebc Merge branch 'llm-extraction' of https://github.com/mendableai/firecrawl into llm-extraction 2024-04-29 12:12:58 -07:00
Caleb Peffer
4f7737c922 Caleb: added ajv json schema validation. 2024-04-29 12:12:55 -07:00
rafaelsideguide
f8b207793f changed the request to do a HEAD to check for a PDF instead 2024-04-29 15:15:32 -03:00
Nicolas
b69feab916 Merge branch 'main' into llm-extraction 2024-04-29 08:40:44 -07:00
Rafael Miller
71bdbf9f15 Merge pull request #67 from mendableai/feat/python-sdk-502
[Feat] Implemented retry attempts to handle 502 errors
2024-04-29 08:38:19 -03:00
Caleb Peffer
667f740315 Caleb: converted llm response to json 2024-04-28 19:28:28 -07:00
Caleb Peffer
2ad7a58eb7 Caleb: first test passing 2024-04-28 17:38:20 -07:00