rafaelsideguide
8eb2e95f19
Cleaned up
2024-05-13 16:13:10 -03:00
Nicolas
2ce045912f
Nick: disable vision right now
2024-05-13 10:56:08 -07:00
rafaelsideguide
f4348024c6
Added check during scraping to deal with pdfs
...
Checks if the URL is a PDF during the scraping process (single_url.ts).
TODO: Run integration tests. Does this strategy affect the running time?
P.S. Some comments need to be removed if we decide to proceed with this strategy.
2024-05-13 09:13:42 -03:00
Rafael Miller
5a2712fa5a
Merge branch 'main' into detect-pdfs
2024-05-10 15:53:13 -03:00
rafaelsideguide
df16890f84
Added default value for crawlOptions.limit
2024-05-10 11:59:33 -03:00
rafaelsideguide
18480b2005
Removed .env.example, improved docs and docker compose envs
2024-05-10 11:38:17 -03:00
Nicolas
66bd1e4020
Update website_params.ts
2024-05-09 18:41:15 -07:00
Nicolas
d21091bb06
Update single_url.ts
2024-05-09 17:52:46 -07:00
Nicolas
be85008622
Nick: better
2024-05-09 17:48:11 -07:00
Nicolas
be5661a768
Nick: a lot better
2024-05-09 17:45:16 -07:00
Nicolas
fce17e6beb
Update credit_billing.ts
2024-05-09 15:29:58 -07:00
Nicolas
9541ff6b30
Nick: 429 addressed
2024-05-08 15:14:39 -07:00
Nicolas
4a5f87623c
Merge pull request #118 from mendableai/feat/test-suite
...
[Test] Added integration tests suite
2024-05-08 12:47:17 -07:00
Nicolas
b7e3104c7b
Ni
2024-05-08 12:18:53 -07:00
Eric Ciarla
d280bcadf3
Add keyAuth
2024-05-07 13:52:42 -04:00
Nicolas
dcedb8d798
Merge branch 'main' into feat/max-depth
2024-05-07 10:20:49 -07:00
Nicolas
6505bf6bf2
Merge branch 'main' into feat/max-depth
2024-05-07 10:20:44 -07:00
Nicolas
bdbee963f7
Merge branch 'main' into nsc/cancel-job
2024-05-07 10:13:43 -07:00
rafaelsideguide
61d615c04b
Added tests
2024-05-07 14:03:00 -03:00
rafaelsideguide
e1f52c538f
nested includeHtml inside pageOptions
2024-05-07 13:40:24 -03:00
Nicolas
f46bf19fa5
Nick:
2024-05-07 09:26:52 -07:00
rafaelsideguide
83f3408634
Added max depth option
2024-05-07 11:06:26 -03:00
Nicolas
2e3ff85509
Update crawl-cancel.ts
2024-05-06 17:22:16 -07:00
Nicolas
6d5da358cc
Nick: cancel job
2024-05-06 17:16:43 -07:00
rafaelsideguide
509250c4ef
changed to includeHtml
2024-05-06 19:45:56 -03:00
rafaelsideguide
538355f1af
Added toMarkdown option
2024-05-06 11:36:44 -03:00
Nicolas
d1b6f6dcde
Update fly.toml
2024-05-04 13:49:09 -07:00
Nicolas
cd9a0840b5
Update search.ts
2024-05-04 13:13:15 -07:00
Nicolas
5229a4902b
Update search.ts
2024-05-04 13:09:11 -07:00
Nicolas
ce7bab7b35
Update status.ts
2024-05-04 13:00:38 -07:00
Nicolas
15b774e974
Update index.ts
2024-05-04 12:44:30 -07:00
Nicolas
67f135a5b6
Update crawl-status.ts
2024-05-04 12:31:28 -07:00
Nicolas
2aa09a3000
Nick: partial docs working, cleaner
2024-05-04 12:30:12 -07:00
Nicolas
00373228fa
Update index.ts
2024-05-04 11:53:16 -07:00
Nicolas
21cdaf5996
Update log_job.ts
2024-05-02 12:40:49 -07:00
Eric Ciarla
caf3f9eede
Add Posthog Logging
2024-05-02 15:30:22 -04:00
Nicolas
8a95cb42f0
Update models.ts
2024-04-30 18:36:21 -07:00
Nicolas
4967536501
Update index.ts
2024-04-30 18:19:55 -07:00
Nicolas
768166b066
Update single_url.ts
2024-04-30 16:57:44 -07:00
Nicolas
a386259511
Update scrape.ts
2024-04-30 16:35:44 -07:00
Nicolas
dfcf39f4c0
Update scrape.ts
2024-04-30 16:19:59 -07:00
Nicolas
3c7030dbb1
Nick: improvements
2024-04-30 16:19:32 -07:00
Nicolas
cbd9e88b77
Merge branch 'main' into llm-extraction
2024-04-30 14:49:20 -07:00
Nicolas
4f526cff92
Nick: cleanup
2024-04-30 12:19:43 -07:00
Caleb Peffer
d9d206aff6
Caleb:
2024-04-30 10:27:39 -07:00
Caleb Peffer
d1235a0029
Caleb: switched back to markdown for extraction
2024-04-30 10:23:12 -07:00
Caleb Peffer
ad9c8e77d1
Caleb: commented out massive test
2024-04-30 10:22:09 -07:00
Caleb Peffer
a32f2b37b6
Caleb: logs work
2024-04-30 10:21:41 -07:00
Caleb Peffer
3ca9e5153f
Caleb: trying to get logging working
2024-04-30 09:20:15 -07:00
rafaelsideguide
a095e1b63d
Resolve merge conflicts with main
2024-04-30 10:54:18 -03:00
rafaelsideguide
35480bd2ad
Update index.test.ts
2024-04-30 10:40:32 -03:00
rafaelsideguide
d3c36adaa7
Update index.ts
2024-04-29 17:58:47 -03:00
Caleb Peffer
79cd7d2ebc
Merge branch 'llm-extraction' of https://github.com/mendableai/firecrawl into llm-extraction
2024-04-29 12:12:58 -07:00
Caleb Peffer
4f7737c922
Caleb: added ajv json schema validation.
2024-04-29 12:12:55 -07:00
rafaelsideguide
f8b207793f
changed the request to do a HEAD to check for a PDF instead
2024-04-29 15:15:32 -03:00
Nicolas
b69feab916
Merge branch 'main' into llm-extraction
2024-04-29 08:40:44 -07:00
Caleb Peffer
667f740315
Caleb: converted llm response to json
2024-04-28 19:28:28 -07:00
Caleb Peffer
2ad7a58eb7
Caleb: first test passing
2024-04-28 17:38:20 -07:00
Caleb Peffer
06497729e2
Caleb: got it to a testable state I believe
2024-04-28 15:52:09 -07:00
Caleb Peffer
6ee1f2d3bc
Caleb: initially pulled inspiration code from https://github.com/mishushakov/llm-scraper
2024-04-28 13:59:35 -07:00
Nicolas
68838c9e0d
Update single_url.ts
2024-04-28 12:44:00 -07:00
Nicolas
d8ee4e90d6
Update website_params.ts
2024-04-28 11:47:25 -07:00
Nicolas
8e44696c4d
Nick:
2024-04-28 11:34:25 -07:00
Nicolas
7689c31d35
Update credit_billing.ts
2024-04-26 14:36:19 -07:00
Nicolas
0a607b9efa
Merge branch 'main' into feat/coupons
2024-04-26 14:23:35 -07:00
Nicolas
fdf913e0f1
Update index.test.ts
2024-04-26 13:06:48 -07:00
Nicolas
8e32453424
Update auth.ts
2024-04-26 12:57:49 -07:00
rafaelsideguide
1f48998970
done
2024-04-26 16:27:31 -03:00
Nicolas
d210a57a9b
Update credit_billing.ts
2024-04-26 10:24:36 -07:00
Nicolas
24e1bdec1b
Update credit_billing.ts
2024-04-26 10:14:29 -07:00
rafaelsideguide
06675d1fe3
almost finished
2024-04-26 11:42:49 -03:00
Nicolas
3ac8724329
Update openapi.json
2024-04-25 13:28:07 -07:00
Nicolas
a3911bfc67
Update index.ts
2024-04-25 10:00:35 -07:00
rafaelsideguide
9c481e5e83
[Feat] Coupon system
...
WIP. Idea for solving #57
2024-04-25 10:05:53 -03:00
rafaelsideguide
75597f72a1
[Feat] Added allowed urls
...
FireCrawl should be able to scrape LinkedIn Articles (/pulse/*)
2024-04-25 08:39:45 -03:00
Nicolas
a59ddf1855
Nick: default to serper
2024-04-24 18:00:25 -07:00
Roger M
f2690f6909
Support for tbs, filter, lang, country and location with Serper search.
2024-04-25 01:35:17 +01:00
Nicolas
e7d385ad32
Update search.ts
2024-04-24 10:23:26 -07:00
Nicolas
877af4231b
Update openapi.json
2024-04-24 10:11:44 -07:00
Nicolas
307ea6f5ec
Nick: improvements to search
2024-04-24 10:11:01 -07:00
Rafael Miller
f189589da4
Merge pull request #34 from mendableai/nsc/returnOnlyUrls
...
Implements the ability for the crawler to output all the links it found, without scraping
2024-04-24 10:34:42 -03:00
rafaelsideguide
07e93ee5fd
Update requests.http
2024-04-24 10:32:35 -03:00
rafaelsideguide
942ac3b41c
Resolved merge conflicts between feat/added-anthropic-vision-api and main
2024-04-24 09:57:45 -03:00
Nicolas
3b5b868d0d
Update requests.http
2024-04-23 18:13:58 -07:00
Nicolas
8939ca570b
Merge branch 'main' into nsc/returnOnlyUrls
2024-04-23 18:05:48 -07:00
Nicolas
479fa2f7f8
Nick:
2024-04-23 17:46:32 -07:00
Nicolas
fdb2789eaa
Nick: added url as return param
2024-04-23 17:14:34 -07:00
Nicolas
3abfd6b4c1
Update search.ts
2024-04-23 17:06:48 -07:00
Nicolas
53cc4c396f
Update search.ts
2024-04-23 17:05:58 -07:00
Nicolas
734c76fc56
Merge branch 'main' into nsc/mvp-search
2024-04-23 17:04:31 -07:00
Nicolas
f0695c7123
Update single_url.ts
2024-04-23 17:04:10 -07:00
Nicolas
4328a68ec1
Nick:
2024-04-23 16:57:53 -07:00
Nicolas
e6779aff68
Nick: tests
2024-04-23 16:56:09 -07:00
Nicolas
9ded75adb7
Merge branch 'main' into nsc/mvp-search
2024-04-23 16:52:40 -07:00
Nicolas
f3c190c21c
Nick:
2024-04-23 16:47:24 -07:00
Nicolas
41263bb4b6
Nick: serper support
2024-04-23 16:45:06 -07:00
Nicolas
8cb5d7955a
Update googlesearch.ts
2024-04-23 15:49:05 -07:00
Nicolas
495adc9a3f
Update googlesearch.ts
2024-04-23 15:48:37 -07:00
Nicolas
5e3e2ec966
Nick:
2024-04-23 15:44:11 -07:00
Nicolas
0146157876
Nick: mvp
2024-04-23 15:28:32 -07:00