> Try running any tidyverse code after more than 6 months, it's almost guaranteed to be broken.
You are entitled to your opinion (which I see in every thread that discusses the tidyverse), but in my opinion this is a considerable and outdated exaggeration that will mislead less experienced readers. Let's balance it out with a different perspective.
1. The core of the tidyverse, the data manipulation package dplyr, reached version 1.0 in May 2020, and its maintainers promised to keep the API stable from that point on. To the best of my knowledge, they have done so, and anyone who wants to verify this can look at the changelogs. That's nearly 3 years of stability.
2. For several years, functions in the tidyverse have had a lifecycle indicator that appears in the documentation. It tells you if the function has reached a mature state or is still experimental. To the best of my knowledge, they have kept the promises from the lifecycle indicator.
3. I have been a full-time R and tidyverse user since dplyr was first released in 2014, and my personal experience is consistent with the two observations above. I agree with the parent commenter that the tidyverse API used to be unstable, but this was mainly in 2019 or earlier, before dplyr went to 1.0. And even back then, they were always honest about when the API might change. So now that the tidyverse maintainers are saying dplyr and other tidyverse packages are stable, I see no rational basis to doubt them.
4. Finally, even during the early unstable API period of the tidyverse, I personally did not find it such a great burden to upgrade my code as the tidyverse improved. It was actually quite thrilling to watch Hadley's vision develop and incrementally learn and use the concepts as he built them out. To use the tidyverse is to be part of something greater, part of the future, part of a new way of thinking that makes you a better data analyst.
IMHO, the functionality and ergonomics of the tidyverse are light-years ahead of any other* data frame package, and anyone who doesn't try it because of some negative anecdotes is missing out.
*No argument from me if you prefer data.table. It's got some performance advantages on large data and a different philosophy that may appeal more to some. Financial time series folks often prefer it. YMMV.
Thinking that something is really great, and that it is how I want to see the future develop, is not a religion or a quasi-religion. It's just my experience, one that I hope others can benefit from.
I think it's totally fine to use base R or data.table or whatever else you like. There is no one right way, and I have used all of these and more in different contexts. But if people are getting impressions of pros and cons from the discussion here on HN, they should be aware that claims of API instability are several years out of date. It would be a shame if people were scared about instability that isn't there.