TestDanRun @testdanrun

Coding all day long
Joined October 2019

Tweets: 43 · Followers: 6 · Following: 97 · Likes: 485

TestDanRun @testdanrun

5 months ago

@shinjiw_at_cmu Exciting! What will the next bound of OWSM be?

1 0 1 120 0

Tom Dörr @tom_doerr

6 months ago

Transcribes and summarizes meetings locally github.com/Zackriya-Solut…

2 22 191 9K 202

Tom Dörr @tom_doerr

6 months ago

Transcribes audio into notes on your infrastructure github.com/murtaza-nasir/…

2 18 115 6K 121

@alphacep I had pretty much the same intuition as well. I think FAIR's approach with the Omnilingual models led to great performance for REALLY low-resource languages, but not so much for pushing performance for high-resource and even mid-resource languages.

0 0 1 281 0

TestDanRun @testdanrun

10 months ago

@unilightwf What does this mean? So they tested to see if evaluators would mistake the human voice for a synthetically generated one? And out of all the tests, only 78.33% of the human voices in LJSpeech were identified as real voices?

1 0 1 71 0

TestDanRun @testdanrun

a year ago

@jiatongshi Awesome!!

0 0 1 81 0

TestDanRun @testdanrun

a year ago

@huckiyang @icmlconf @chenwanch1 @MXzBFhjFpS1jyMI @shinjiw_at_cmu @WavLab @NVIDIAAI That sounds interesting! I'm also interested in finding out what's a good distribution of data across languages to train a good multilingual ASR/AST model!

0 0 1 47 0

TestDanRun @testdanrun

a year ago

@chenwanch1 @LTIatCMU @nvidia Congratulations!! Can't wait to try the models out as a fellow ASR and Rowlet enthusiast!!

0 0 0 44 0

Alibaba_SpeechAI @TONGYI_SpeechAI

a year ago

🎵 Introducing InspireMusic – an open-source music generation toolkit from Tongyi Lab, designed as an all-in-one AIGC toolkit for music, song, and audio creation.

Whether you're a researcher, developer, or music enthusiast, InspireMusic has something for you:
For researchers and developers: Train and fine-tune music/song/audio generation models with ease, optimizing the creative output.
For music lovers: An intuitive tool to create music, songs, or audio content using text descriptions or audio prompts.

🚀 What's special about InspireMusic?
· Unified Audio Generation Framework: Powered by advanced generative model technology, InspireMusic supports music, song, and audio generation, offering diverse possibilities.
· Flexible and Controllable Output: Generate music with precise style and structure by using text prompts and musical feature descriptions.
· Simple and User-Friendly: Streamlined tools for model fine-tuning and inference, ensuring efficient training and improvements.

✨ Try it out now!
🎵 GitHub Repository: github.com/FunAudioLLM/In…
🎶 Online Experience:
🤗 HuggingFace Spaces: huggingface.co/spaces/FunAudi…
♪ Demo Page: iris2c.github.io/InspireMusic

Start creating your own musical masterpiece today! 🎶

8 64 261 17K 206

TestDanRun @testdanrun

a year ago

@mhnt1580 Awesome! I am still wrapping my head around the "content usefulness" axis though. For sound especially, would it be right to say that if there are enough sounds to form a scene, it would be useful content?

1 0 0 55 0

TestDanRun @testdanrun

2 years ago

@alphacep I'm wondering, given how performant Whisper is, are there still substantial benefits to pre-train and finetune your own self-supervised model, or would we get better results just from finetuning Whisper?

1 0 0 35 0

TestDanRun @testdanrun

2 years ago

#INTERCATSPEECH2024

0 1 2 269 0

TestDanRun @testdanrun

2 years ago

@unilightwf Hello! I'm interested in building speech evaluation frameworks! Could you provide some details? :)

0 0 0 36 0

RWKV @RWKV_AI

2 years ago

Introducing Eagle-7B

Based on the RWKV-v5 architecture, bringing into opensource space, the strongest
- multi-lingual model (beating even mistral)
- attention-free transformer today (10-100x+ lower inference)

With comparable English performance with the best 1T 7B models

18 233 1K 353K 707

Berrak Sisman @berraksismann

3 years ago

We are organizing The Speaker and Language Recognition Workshop (Odyssey) 2024, which will be held in Canada. The theme, "No Speaker Left Behind", underscores our commitment to overcoming disparities that affect individuals with diverse accents, backgrounds, or speech variations.

1 11 28 3K 5

Yoach @yoachlacombe

3 years ago

Excited to share that you can now finetune 1100+ TTS models thanks to @AIatMeta's MMS and the library shared below! In my experiments, getting an excellent finetuned version of any MMS checkpoint takes just 20 minutes, with as few as 80 to 150 samples, across all models.

3 16 68 18K 62

Hrishi @hrishioa

3 years ago

A week ago one of our customers handed us 1000 pages of this (10,000 more to come), and asked us for a RAG solution. We said yes - because we said yes before we saw the document. But we've solved it - and there's a chance it's a strong improvement on all RAG SoTA.

52 138 2K 455K 2K

TestDanRun @testdanrun

3 years ago

@lileics Ahh, I wasn't able to make it. Would you be sharing the slides?

0 0 0 52 0

Gallil Maimon @GallilMaimon

3 years ago

Happy to share our paper "Speaking Style Conversion with Discrete Self-Supervised Units" got accepted to #EMNLP2023 🌟 Project page - pages.cs.huji.ac.il/adiyoss-lab/di… W/ @adiyossLC (1/n) 🧵👇