Nicolas
b4c6819a54
Nick:
2024-06-05 11:11:09 -07:00
rafaelsideguide
0d51b11dcd
missing breaks
2024-06-05 15:02:28 -03:00
Nicolas
beb7526d1d
Update webhook.ts
2024-06-05 10:38:05 -07:00
Nicolas
1a16378fe8
Merge pull request #234 from JakobStadlhuber/feat/webhook-self-hosted
Add support for Self-Hosted Webhook URL Usage and added project_id into the webhook payload
2024-06-05 10:25:05 -07:00
Nicolas
7cb14edec8
Nick:
2024-06-05 10:13:52 -07:00
Rafael Miller
9e000ded03
Merge branch 'main' into feat/better-gdrive-pdf-fetch
2024-06-05 14:07:56 -03:00
rafaelsideguide
ccc55127d6
Added scroll xpaths on fire-engine for handling readme docs
2024-06-05 11:48:41 -03:00
rafaelsideguide
b5045d1661
[feat] improved the scrape for gdrive pdfs
2024-06-04 17:47:28 -03:00
Nicolas
96257b7b17
Update handleCustomScraping.ts
2024-06-04 12:22:46 -07:00
Nicolas
674500affa
Nick:
2024-06-04 12:15:39 -07:00
rafaelsideguide
5ae4d1caf5
Update single_url.ts
2024-06-04 15:28:09 -03:00
Jakob Stadlhuber
6208f4207d
Add support for Self-Hosted Webhook URL Usage and added project_id into the webhook payload
This commit introduces support for a self-hosted webhook URL. The application now checks for a self-hosted URL before querying the database for the webhook settings. If a self-hosted webhook URL is set in the environment variables, it is used directly, avoiding unnecessary database queries.
2024-06-04 19:55:07 +02:00
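A minimal sketch of the lookup order described in the commit body above, not the code from the commit itself. The environment variable name `SELF_HOSTED_WEBHOOK_URL`, the `resolveWebhookUrl` function, and the `WebhookLookup` callback are hypothetical names chosen for illustration.

```typescript
// Hypothetical callback standing in for the database query for webhook settings.
type WebhookLookup = (teamId: string) => Promise<string | null>;

// Resolve the webhook URL, preferring the environment over the database.
// SELF_HOSTED_WEBHOOK_URL is an assumed variable name, not confirmed by the source.
async function resolveWebhookUrl(
  teamId: string,
  env: Record<string, string | undefined>,
  lookupInDatabase: WebhookLookup,
): Promise<string | null> {
  const selfHosted = env.SELF_HOSTED_WEBHOOK_URL?.trim();
  if (selfHosted) {
    // Environment value wins: no database round trip is made.
    return selfHosted;
  }
  // No self-hosted URL configured: fall back to the stored settings.
  return lookupInDatabase(teamId);
}
```

Passing the environment and the lookup as arguments keeps the sketch testable without a real database. In an application, `process.env` would presumably be passed as `env`.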
rafaelsideguide
64a4338ff0
Update single_url.ts
2024-06-04 14:40:05 -03:00
Rafael Miller
02fe470e20
Merge pull request #148 from mendableai/nsc/improvemnts-fixes-misc
Better fallbacks for initial crawl start
2024-06-04 14:31:10 -03:00
Rafael Miller
b80fb374e5
Merge branch 'main' into playwright-service-bug-222
2024-06-04 11:57:17 -03:00
rafaelsideguide
6920ec8a61
bugfixing. already on main
2024-06-04 11:05:50 -03:00
Nicolas
cbf8d79cce
Update pdfProcessor.ts
2024-06-04 00:13:37 -07:00
Nicolas
2ea01f1456
Update single_url.ts
2024-06-03 23:42:39 -07:00
Nicolas
854d5b3cb3
Update single_url.ts
2024-06-03 23:32:55 -07:00
Nicolas
99059814a8
Nick:
2024-06-03 21:32:48 -07:00
Nicolas
918059ee9e
Merge branch 'main' into nsc/improvemnts-fixes-misc
2024-06-03 16:46:02 -07:00
Nicolas
38e583f66c
Update socialBlockList.test.ts
2024-06-03 16:44:23 -07:00
Nicolas
c69c89f838
Nick:
2024-06-03 16:42:42 -07:00
Nicolas
48d1ec05b2
Merge branch 'main' into nsc/improved-blocklist
2024-06-03 16:38:03 -07:00
Nicolas
d30ced4394
Merge pull request #221 from mendableai/nsc/fwd-header-auth
feat: Ability to forward headers to reliable providers for auth etc...
2024-06-03 16:33:40 -07:00
rafaelsideguide
c1aed1360e
Update index.test.ts
2024-06-03 15:51:07 -03:00
rafaelsideguide
1fc3a15149
Update single_url.ts
2024-06-03 15:24:40 -03:00
Nicolas
fde522c3e1
Update single_url.ts
2024-06-02 20:23:45 -07:00
Matt Joyce
deefe65cbe
Change the way the playwright response is parsed
Parsing was failing with a type error even though the response itself looked OK.
This fixes the type error and stops the scraper fallback.
2024-06-01 19:16:56 +10:00
Nicolas
8cb62dde92
Update website_params.ts
2024-05-31 16:09:39 -07:00
Nicolas
3b8059edb6
Update single_url.ts
2024-05-31 15:43:06 -07:00
Nicolas
6bea803120
Nick:
2024-05-31 15:39:54 -07:00
Nicolas
2139129296
Nick: v12
2024-05-31 11:39:55 -07:00
Nicolas
260e31c68b
Merge branch 'nsc/new-pricing'
2024-05-30 16:08:31 -07:00
Nicolas
aa8133ca7f
Update load-testing-example.ts
2024-05-30 16:07:14 -07:00
Nicolas
0c115c6181
Merge pull request #216 from mendableai/nsc/new-pricing
feat: New pricing/limits changes
2024-05-30 15:36:59 -07:00
Nicolas
6860ace4af
Nick:
2024-05-30 15:07:49 -07:00
Nicolas
6ceb7ff50a
Nick:
2024-05-30 14:46:55 -07:00
Nicolas
33f10a7f91
Nick: fixes
2024-05-30 14:42:32 -07:00
Nicolas
ace46f340b
Nick: new limits, new pricing
2024-05-30 14:31:36 -07:00
Nicolas
6c939d534d
Nick: small refactor
2024-05-29 19:43:51 -07:00
Eric Ciarla
37915e11e8
Final push
2024-05-29 21:18:24 -04:00
Eric Ciarla
a0e404f94e
init commit
2024-05-29 18:56:57 -04:00
rafaelsideguide
ee9a2184e2
Added custom scraping conditions for readme docs
2024-05-29 13:39:43 -03:00
Nicolas
c20c38721d
Update index.test.ts
2024-05-28 17:17:20 -07:00
Nicolas
0f43a12906
Update index.test.ts
2024-05-28 17:17:12 -07:00
Nicolas
1b3547dcf2
Nick:
2024-05-28 12:56:24 -07:00
Nicolas
1bbfb98d7e
Merge pull request #186 from Keredu/main
Limit on /search is not deterministic
2024-05-26 18:08:16 -07:00
Nicolas
7e2df7bd5e
Update auth.ts
2024-05-26 18:07:21 -07:00
Simon H
115204e6b6
Feat: Provide more details for 429 error msg
- Added a more detailed error for when the rate limit is exceeded, including consumed/remaining points, reset date, and retry-after seconds
2024-05-25 12:03:20 -04:00
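A rough sketch of what a more detailed 429 response along these lines could look like, not the code from the commit itself. The `RateLimitState` shape, the `buildRateLimitError` function, and the message wording are assumptions made for illustration. Only the four reported values (consumed points, remaining points, reset date, retry-after seconds) come from the commit message.

```typescript
// Assumed shape of the limiter state at the moment a request is rejected.
interface RateLimitState {
  consumedPoints: number;
  remainingPoints: number;
  msBeforeNext: number; // milliseconds until the limit resets
}

// Build a 429 response carrying the details named in the commit message.
// The function name and response layout are hypothetical.
function buildRateLimitError(state: RateLimitState, now: Date = new Date()) {
  // Round up so clients never retry slightly too early.
  const retryAfterSeconds = Math.ceil(state.msBeforeNext / 1000);
  const resetDate = new Date(now.getTime() + state.msBeforeNext);
  return {
    status: 429,
    headers: { "Retry-After": String(retryAfterSeconds) },
    body: {
      error:
        `Rate limit exceeded. Consumed points: ${state.consumedPoints}, ` +
        `remaining points: ${state.remainingPoints}. ` +
        `Please retry after ${retryAfterSeconds}s; ` +
        `the limit resets at ${resetDate.toISOString()}.`,
    },
  };
}
```

Sending the delay in a `Retry-After` header as well as in the message lets HTTP clients back off without parsing the error text.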
Keredu
2192978f91
Limit on /search is not deterministic
2024-05-25 00:12:26 +02:00
Nicolas
e98434606d
Update blocklist.ts
2024-05-24 15:04:15 -07:00
Nicolas
e5c8719554
Update blocklist.ts
2024-05-24 14:53:04 -07:00
rafaelsideguide
d39860c08b
Merge branch 'main' into feat/idempotency-key
2024-05-24 14:15:37 -03:00
Nicolas
53a214cefb
Merge pull request #168 from mendableai/nsc/allowed-keywords-in-blocklist
feat: Allow privacy/legal/ other pages in social media websites
2024-05-24 09:43:15 -07:00
rafaelsideguide
35927a65a5
Merge branch 'main' into feat/idempotency-key
2024-05-23 12:20:06 -03:00
rafaelsideguide
184e4678f1
bugfix on idempotency key check
2024-05-23 11:47:04 -03:00
rafaelsideguide
4dfc371241
Update index.test.ts
2024-05-22 14:38:41 -03:00
rafaelsideguide
f4a3469b9e
Merge branch 'main' into bug/crawl-limit
2024-05-22 14:27:28 -03:00
Nicolas
0d187f0425
Merge pull request #77 from tractorjuice/patch-1
Add additional file extensions to crawler.ts
2024-05-22 10:16:49 -07:00
Nicolas
cb2bd0e71f
Update index.test.ts
2024-05-21 19:03:32 -07:00
Nicolas
253abb849f
Update rate-limiter.ts
2024-05-21 18:53:58 -07:00
Nicolas
229b9908d2
Nick: only enable hyper dx in prod
2024-05-21 18:52:46 -07:00
Nicolas
a8ff295977
Update single_url.ts
2024-05-21 18:50:42 -07:00
Nicolas
a5e718b084
Nick: improvements
2024-05-21 18:34:23 -07:00
Nicolas
6285f12cd1
Merge pull request #167 from mendableai/nsc/hyper-dx-integration
feat: HyperDX Integration
2024-05-21 13:19:38 -07:00
Nicolas
7f64fe884a
Update blocklist.ts
2024-05-20 17:26:01 -07:00
Nicolas
756f54466d
Nick: allowed keywords for now
2024-05-20 17:24:21 -07:00
Nicolas
77a79b5a79
Nick: max num tokens for llm extract (for now) + slice the max
2024-05-20 17:07:38 -07:00
Nicolas
9e61d431f0
Nick: hyper dx integration init
2024-05-20 13:36:34 -07:00
Nicolas
c74f757b53
Update rate-limiter.ts
2024-05-19 13:05:36 -07:00
Nicolas
98a39b39ab
Nick: increased rate limits
2024-05-19 12:59:29 -07:00
Nicolas
18fa15df25
Update index.test.ts
2024-05-19 12:50:06 -07:00
Nicolas
614c073af0
Nick: improvements
2024-05-19 12:45:46 -07:00
Nicolas
f473793ba3
Merge branch 'main' into feat/rate-limits
2024-05-19 12:23:34 -07:00
rafaelsideguide
a480595aa7
Update index.test.ts
2024-05-17 15:41:27 -03:00
rafaelsideguide
54049be539
Added e2e tests
2024-05-17 15:37:47 -03:00
Nicolas
6feb21cc35
Update website_params.ts
2024-05-17 11:21:26 -07:00
Nicolas
5be208f595
Nick: fixed
2024-05-17 10:40:44 -07:00
Nicolas
eb88447e8b
Update index.test.ts
2024-05-17 10:00:05 -07:00
Nicolas
df6c3d1e7d
Merge branch 'main' into detect-pdfs
2024-05-17 09:55:51 -07:00
Nicolas
9d635cb2a3
Nick: docx support
2024-05-16 11:48:02 -07:00
Nicolas
80250fb54f
Update index.test.ts
2024-05-15 17:40:46 -07:00
Nicolas
098db17913
Update index.ts
2024-05-15 17:37:09 -07:00
Nicolas
93b1f0334e
Update index.test.ts
2024-05-15 17:35:06 -07:00
Nicolas
123fb784ca
Update index.test.ts
2024-05-15 17:29:22 -07:00
Nicolas
4a6cfb6097
Update index.test.ts
2024-05-15 17:22:29 -07:00
Nicolas
6ca368327f
Merge branch 'main' into test/crawl-options
2024-05-15 17:18:25 -07:00
Nicolas
24be4866c5
Nick:
2024-05-15 17:16:20 -07:00
Nicolas
ade4e05cff
Nick: working
2024-05-15 17:13:04 -07:00
Nicolas
bfccaf670d
Nick: fixes most of it
2024-05-15 15:30:37 -07:00
rafaelsideguide
d91043376c
not working yet
2024-05-15 18:54:40 -03:00
rafaelsideguide
fa014defc7
Fixing child links only bug
2024-05-15 18:35:09 -03:00
Nicolas
2ba743fb1a
Merge pull request #27 from eltociear/patch-1
refactor: fix typo in WebScraper/index.ts
2024-05-15 13:28:38 -07:00
Nicolas
58053eb423
Update rate-limiter.ts
2024-05-15 12:47:35 -07:00
Nicolas
1601e93d69
Merge branch 'main' into test/crawl-options
2024-05-15 12:34:47 -07:00
rafaelsideguide
4925ee59f6
added crawl test suite
2024-05-15 15:50:50 -03:00
Nicolas
1b0d6341d3
Update index.ts
2024-05-15 11:48:12 -07:00
Nicolas
d10f81e7fe
Nick: fixes
2024-05-15 11:28:20 -07:00
Nicolas
87570bdfa1
Update index.ts
2024-05-15 11:06:03 -07:00