Commit Graph

316 Commits

Author SHA1 Message Date
rafaelsideguide
35927a65a5 Merge branch 'main' into feat/idempotency-key 2024-05-23 12:20:06 -03:00
rafaelsideguide
184e4678f1 bugfix on idempotency key check 2024-05-23 11:47:04 -03:00
rafaelsideguide
4dfc371241 Update index.test.ts 2024-05-22 14:38:41 -03:00
rafaelsideguide
f4a3469b9e Merge branch 'main' into bug/crawl-limit 2024-05-22 14:27:28 -03:00
Nicolas
0d187f0425 Merge pull request #77 from tractorjuice/patch-1: Add additional file extensions to crawler.ts 2024-05-22 10:16:49 -07:00
Nicolas
cb2bd0e71f Update index.test.ts 2024-05-21 19:03:32 -07:00
Nicolas
253abb849f Update rate-limiter.ts 2024-05-21 18:53:58 -07:00
Nicolas
229b9908d2 Nick: only enable hyper dx in prod 2024-05-21 18:52:46 -07:00
Nicolas
a8ff295977 Update single_url.ts 2024-05-21 18:50:42 -07:00
Nicolas
a5e718b084 Nick: improvements 2024-05-21 18:34:23 -07:00
Nicolas
6285f12cd1 Merge pull request #167 from mendableai/nsc/hyper-dx-integration: feat: HyperDX Integration 2024-05-21 13:19:38 -07:00
Nicolas
7f64fe884a Update blocklist.ts 2024-05-20 17:26:01 -07:00
Nicolas
756f54466d Nick: allowed keywords for now 2024-05-20 17:24:21 -07:00
Nicolas
01783dc336 Update openapi.json 2024-05-20 17:10:55 -07:00
Nicolas
77a79b5a79 Nick: max num tokens for llm extract (for now) + slice the max 2024-05-20 17:07:38 -07:00
Nicolas
2644e1c029 Update .env.example 2024-05-20 13:36:51 -07:00
Nicolas
9e61d431f0 Nick: hyper dx integration init 2024-05-20 13:36:34 -07:00
Nicolas
c74f757b53 Update rate-limiter.ts 2024-05-19 13:05:36 -07:00
Nicolas
98a39b39ab Nick: increased rate limits 2024-05-19 12:59:29 -07:00
Nicolas
18fa15df25 Update index.test.ts 2024-05-19 12:50:06 -07:00
Nicolas
614c073af0 Nick: improvements 2024-05-19 12:45:46 -07:00
Nicolas
f473793ba3 Merge branch 'main' into feat/rate-limits 2024-05-19 12:23:34 -07:00
rafaelsideguide
a480595aa7 Update index.test.ts 2024-05-17 15:41:27 -03:00
rafaelsideguide
54049be539 Added e2e tests 2024-05-17 15:37:47 -03:00
Nicolas
6feb21cc35 Update website_params.ts 2024-05-17 11:21:26 -07:00
Nicolas
5be208f595 Nick: fixed 2024-05-17 10:40:44 -07:00
Nicolas
eb88447e8b Update index.test.ts 2024-05-17 10:00:05 -07:00
Nicolas
df6c3d1e7d Merge branch 'main' into detect-pdfs 2024-05-17 09:55:51 -07:00
Nicolas
9d635cb2a3 Nick: docx support 2024-05-16 11:48:02 -07:00
Nicolas
bcce0544e7 Update openapi.json 2024-05-16 11:03:32 -07:00
Nicolas
80250fb54f Update index.test.ts 2024-05-15 17:40:46 -07:00
Nicolas
098db17913 Update index.ts 2024-05-15 17:37:09 -07:00
Nicolas
93b1f0334e Update index.test.ts 2024-05-15 17:35:06 -07:00
Nicolas
123fb784ca Update index.test.ts 2024-05-15 17:29:22 -07:00
Nicolas
4a6cfb6097 Update index.test.ts 2024-05-15 17:22:29 -07:00
Nicolas
6ca368327f Merge branch 'main' into test/crawl-options 2024-05-15 17:18:25 -07:00
Nicolas
24be4866c5 Nick: 2024-05-15 17:16:20 -07:00
Nicolas
ade4e05cff Nick: working 2024-05-15 17:13:04 -07:00
Nicolas
bfccaf670d Nick: fixes most of it 2024-05-15 15:30:37 -07:00
rafaelsideguide
d91043376c not working yet 2024-05-15 18:54:40 -03:00
rafaelsideguide
fa014defc7 Fixing child links only bug 2024-05-15 18:35:09 -03:00
Nicolas
2ba743fb1a Merge pull request #27 from eltociear/patch-1: refactor: fix typo in WebScraper/index.ts 2024-05-15 13:28:38 -07:00
Nicolas
0663d78324 Merge pull request #119 from chand1012/main: Add Docker Compose for easy self hosting 2024-05-15 13:27:40 -07:00
Nicolas
58053eb423 Update rate-limiter.ts 2024-05-15 12:47:35 -07:00
Nicolas
1601e93d69 Merge branch 'main' into test/crawl-options 2024-05-15 12:34:47 -07:00
Nicolas
3678d3c986 Merge branch 'main' of https://github.com/mendableai/firecrawl 2024-05-15 12:11:18 -07:00
Nicolas
fd82982a31 Nick: 2024-05-15 12:11:16 -07:00
rafaelsideguide
4925ee59f6 added crawl test suite 2024-05-15 15:50:50 -03:00
Nicolas
1b0d6341d3 Update index.ts 2024-05-15 11:48:12 -07:00
Nicolas
d10f81e7fe Nick: fixes 2024-05-15 11:28:20 -07:00
Nicolas
87570bdfa1 Update index.ts 2024-05-15 11:06:03 -07:00
rafaelsideguide
d4574851be Added rpc definition 2024-05-15 08:40:21 -03:00
rafaelsideguide
47c20c80ab Update auth.ts 2024-05-15 08:34:49 -03:00
Ikko Eltociear Ashimine
e91c122c69 Merge branch 'main' into patch-1 2024-05-15 12:14:52 +09:00
Nicolas
7d8ceab6de Merge branch 'feat/rate-limits' of https://github.com/mendableai/firecrawl into feat/rate-limits 2024-05-14 14:48:01 -07:00
Nicolas
0e0faa28b3 Update auth.ts 2024-05-14 14:47:36 -07:00
rafaelsideguide
672eddb999 updated rpc 2024-05-14 18:47:21 -03:00
Nicolas
4761ea510b Update rate-limiter.ts 2024-05-14 14:26:42 -07:00
rafaelsideguide
40ad97dee8 added rate limits 2024-05-14 18:08:31 -03:00
Nicolas
27e1e22a0a Update index.test.ts 2024-05-14 12:28:25 -07:00
Nicolas
a0fdc6f7c6 Nick: 2024-05-14 12:12:40 -07:00
Nicolas
7f31959be7 Nick: 2024-05-14 12:04:36 -07:00
Nicolas
8a72cf556b Nick: 2024-05-13 21:10:58 -07:00
Nicolas
26a092f780 Update index.ts 2024-05-13 21:04:49 -07:00
Nicolas
8101cbee37 Update index.ts 2024-05-13 21:02:47 -07:00
Nicolas
86b8439844 Nick: 2024-05-13 20:51:42 -07:00
Nicolas
a96fc5b96d Nick: 4x speed 2024-05-13 20:45:11 -07:00
Nicolas
bd27b0e17e Merge pull request #142 from mendableai/doc/crawl-limit-default: [Doc] Added default value for crawlOptions.limit 2024-05-13 18:38:09 -07:00
Nicolas
999176d576 Merge branch 'main' of https://github.com/mendableai/firecrawl 2024-05-13 13:57:34 -07:00
Nicolas
f3ec21d9c4 Update runWebScraper.ts 2024-05-13 13:57:22 -07:00
Nicolas
65d89afba9 Nick: 2024-05-13 13:01:43 -07:00
Eric Ciarla
4cc46d4af8 Update models.ts 2024-05-13 15:23:31 -04:00
rafaelsideguide
8eb2e95f19 Cleaned up 2024-05-13 16:13:10 -03:00
Nicolas
2ce045912f Nick: disable vision right now 2024-05-13 10:56:08 -07:00
rafaelsideguide
f4348024c6 Added check during scraping to deal with pdfs
Checks if the URL is a PDF during the scraping process (single_url.ts).
TODO: Run integration tests - Does this strat affect the running time?
ps. Some comments need to be removed if we decide to proceed with this strategy.
2024-05-13 09:13:42 -03:00
Rafael Miller
5a2712fa5a Merge branch 'main' into detect-pdfs 2024-05-10 15:53:13 -03:00
rafaelsideguide
bc6b929b43 [Bug] Fixing /crawl limit 2024-05-10 12:15:54 -03:00
rafaelsideguide
df16890f84 Added default value for crawlOptions.limit 2024-05-10 11:59:33 -03:00
rafaelsideguide
18480b2005 Removed .env.example, improved docs and docker compose envs 2024-05-10 11:38:17 -03:00
Nicolas
66bd1e4020 Update website_params.ts 2024-05-09 18:41:15 -07:00
Nicolas
d21091bb06 Update single_url.ts 2024-05-09 17:52:46 -07:00
Nicolas
be85008622 Nick: better 2024-05-09 17:48:11 -07:00
Nicolas
be5661a768 Nick: a lot better 2024-05-09 17:45:16 -07:00
Nicolas
fce17e6beb Update credit_billing.ts 2024-05-09 15:29:58 -07:00
Nicolas
9541ff6b30 Nick: 429 addressed 2024-05-08 15:14:39 -07:00
Nicolas
4a5f87623c Merge pull request #118 from mendableai/feat/test-suite: [Test] Added integration tests suite 2024-05-08 12:47:17 -07:00
Nicolas
b7e3104c7b Ni 2024-05-08 12:18:53 -07:00
rafaelsideguide
3f460af6c5 Added idempotency key to crawl route 2024-05-07 15:29:27 -03:00
Eric Ciarla
d280bcadf3 Add keyAuth 2024-05-07 13:52:42 -04:00
Nicolas
dcedb8d798 Merge branch 'main' into feat/max-depth 2024-05-07 10:20:49 -07:00
Nicolas
6505bf6bf2 Merge branch 'main' into feat/max-depth 2024-05-07 10:20:44 -07:00
Nicolas
bdbee963f7 Merge branch 'main' into nsc/cancel-job 2024-05-07 10:13:43 -07:00
rafaelsideguide
61d615c04b Added tests 2024-05-07 14:03:00 -03:00
rafaelsideguide
e1f52c538f nested includeHtml inside pageOptions 2024-05-07 13:40:24 -03:00
Nicolas
f46bf19fa5 Nick: 2024-05-07 09:26:52 -07:00
rafaelsideguide
83f3408634 Added max depth option 2024-05-07 11:06:26 -03:00
Nicolas
2e3ff85509 Update crawl-cancel.ts 2024-05-06 17:22:16 -07:00
Nicolas
6d5da358cc Nick: cancel job 2024-05-06 17:16:43 -07:00
rafaelsideguide
509250c4ef changed to includeHtml 2024-05-06 19:45:56 -03:00
rafaelsideguide
538355f1af Added toMarkdown option 2024-05-06 11:36:44 -03:00